Believe It or Not: 
Adding Belief Annotations to Databases 
ABSTRACT

We propose a database model that allows users to anno- 
tate data with belief statements. Our motivation comes 
from scientific database applications where a commu- 
nity of users is working together to assemble, revise, 
and curate a shared data repository. As the commu- 
nity accumulates knowledge and the database content 
evolves over time, it may contain conflicting informa- 
tion and members can disagree on the information it 
should store. For example, Alice may believe that a tu- 
ple should be in the database, whereas Bob disagrees. 
He may also insert the reason why he thinks Alice be- 
lieves the tuple should be in the database, and explain 
what he thinks the correct tuple should be instead. 

We propose a formal model for Belief Databases that 
interprets users' annotations as belief statements. These 
annotations can refer both to the base data and to other 
annotations. We give a formal semantics based on a 
fragment of multi-agent epistemic logic and define a 
query language over belief databases. We then prove a 
key technical result, stating that every belief database 
can be encoded as a canonical Kripke structure. We use 
this structure to describe a relational representation of 
belief databases, and give an algorithm for translating 
queries over the belief database into standard relational 
queries. Finally, we report early experimental results 
with our prototype implementation on synthetic data. 



1. INTRODUCTION 

In many sciences today, a community of users is work- 
ing together to assemble, revise, and curate a shared 
data repository. Examples of such collaborations include
identifying functions of particular regions of genetic
sequences [39], curating databases of protein functions,
identifying astronomical phenomena on images [43], and
mapping the diversity of species [37].



This is UW CSE Technical Report 08-12-01 in the amended version of Sept 
12th 2009. Pages 1-12 correspond to the final version of PVLDB 2(1): 
1-12 (2009), including modifications to address the reviewer's comments; 
the appendix contains proofs, connection to default logics, and errata. The 
project page is at http://db.cs.washington.edu/beliefDB/.



As the community accumulates knowledge and the data- 
base content evolves over time, it may contain conflict- 
ing information and members may disagree on the in- 
formation it should store. Relational database man- 
agement systems (DBMSs) today can help these com- 
munities manage their shared data, but provide limited 
support for managing conflicting facts and conflicting 
opinions about the correctness of the stored data. 

The recent concept of database annotations aims to 
address this need: annotations are commonly seen as su- 
perimposed information that helps to explain, correct, 
or refute base information [36] without actually changing
it. Annotations have been recognized by scientists as
an essential feature for new generation database management
systems [1] [8] [19], and efficient management
of annotations has become the focus of much recent
work in the database community [7] [11] [13] [15] [23] [24].
Still, the semantic distinction between base information
and annotations remains blurred [10]. Annotations are
simply additional metadata added to existing data [44] 
without unique and distinctive semantics. 

In discussions with scientists from forestry and bio- 
engineering, we have seen the need for an annotation 
semantics that helps collaborating community members 
engage in a structured discussion on both content and 
each other's annotations: scientists do not only want to 
insert their own annotations but also want to be able to 
respond to other scientists' annotations. Such annota- 
tion semantics creates several challenges for a database 
system. First, it needs to allow for conflicting anno- 
tations: Users should be able to use annotations to 
indicate conflicts between what they believe and what 
others believe. The database should allow and expose 
those conflicts. Second, it should also support higher- 
order annotations. Users should be able to annotate not 
only content but also other users' annotations. And, fi- 
nally, the additional functionality should be supported 
on top of a standard DBMS with a simple extension of 
SQL. Any new annotation model should take advantage
of the existing state of the art in query processing.

To address these challenges, we introduce the concept 
of a belief database. A belief database contains base in- 
formation in the form of ground tuples, annotated with 
belief statements. It represents a set of different be- 
lief worlds, each one for one type of belief annotation, 
i.e. the beliefs of a particular user on ground tuples, 
or on another user's beliefs. These belief worlds follow 
an open world assumption and may be overlapping and
partially conflicting with each other. The formal semantics
of belief annotations is defined in terms of multi-agent
epistemic logic [21]. This semantics can be represented
by an appropriate canonical Kripke structure
which, in turn, can be represented in the standard rela- 
tional model and, hence, on top of a standard RDBMS. 
We also introduce belief conjunctive queries, a simple, 
yet versatile query language that serves as interface to a 
belief database and consists of conjunctive queries with 
belief assertions. In addition to retrieving facts believed 
or not believed by certain users, this language can also 
be used to query for agreements or disagreements be- 
tween users. We describe an algorithm for translat- 
ing belief conjunctive queries into non-recursive Datalog 
(and, hence, to SQL). We have implemented a prototype 
Belief Database Management System (BDMS), and de- 
scribe a set of preliminary experiments validating the 
feasibility of translating belief queries into SQL. 

The structure of this paper follows its contributions: 
• We describe a motivating application, and give examples
and a syntax for BeliefSQL (Sect. 2).

• We define a data model and a query language for
belief databases (Sect. 3).

• We describe the canonical Kripke structure that
enables implementing belief databases (Sect. 4).

• We describe a relational representation of belief
databases and the translation of queries and updates
over this canonical representation (Sect. 5).

• We validate our model and report on experiments
with our prototype BDMS (Sect. 6).

The paper ends with an overview of related work (Sect. 7)
and conclusions (Sect. 8).



2. MOTIVATING APPLICATION 

In this section, we present a motivating application 
that we use as running example throughout this paper. 
The scenario is based on the NatureMapping project 
whose goal is to record biodiversity of species in the 
US state of Washington [37] . Participating community 
members volunteer to submit records of animal sightings 
from the field. Each observation includes user-id, date, 
location, species name, and various options to comment 
on the observation, such as details about how the ani- 
mal was identified (e.g., animal tracks were found). As 
sightings are reported by non-experts, they can contain 
errors. In fact, even experts sometimes disagree on the 
exact species of a sighted animal. 

In the current protocol, a single expert in forestry (the 
principal investigator) manually curates all the entries 
before inserting them into the database, which results in 
significant delays and does not allow the application to 
scale to a larger number of volunteers. In this setting, a 
Belief Database Management System (BDMS) can address
this challenge by allowing multiple experts to annotate,
thus streamlining the curation process. Graduate students,
technicians, and expert users can all contribute their
beliefs to annotate the data, thus providing a collaborative
curation process. They can, for example, disagree with
individual sightings, if in their judgment the sighting is
incorrect, and annotate the data accordingly. They can also
correct a sighting by annotating it with corrected values
they believe more plausible than those provided by the
volunteers in the field. And they can also suggest
explanations for other users' annotations, thus leading to
higher-order annotations.

select selectlist
from (((BELIEF user)+ not?)? relationname)+
where conditionlist

insert into ((BELIEF user)+ not?)? relationname
values

delete from ((BELIEF user)+ not?)? relationname
where conditionlist

update ((BELIEF user)+ not?)? relationname
set value assignments
where conditionlist

Figure 1: Syntax of query and data manipulation
commands in BeliefSQL.

We now illustrate the use of a BDMS. We assume 
three users (Alice, Bob, and Carol) and a simplified 
database schema consisting of three relations: 

Sightings(sid, uid, species, date, location) 
Comments(cid, comment, sid) 
Users(uid, name)

We refer to this schema as the external schema since it
presents the way users enter and retrieve data. Beliefs,
in contrast, are stored transparently from users and can
be manipulated via natural extensions to standard SQL
(Fig. 1). We illustrate its usage through examples next.



Little Carol sees a bald eagle during her school trip 
and reports her sighting with the following insert: 

i1: insert into Sightings
    values ('s1','Carol','bald eagle','6-14-08','Lake Forest')

Bob, a graduate student, however, does not believe that 
Carol saw a bald eagle: 

i2: insert into BELIEF 'Bob' not Sightings
    values ('s1','Carol','bald eagle','6-14-08','Lake Forest')

Additionally, Bob does not believe that Carol could have 
seen a fish eagle, which looks similar to a bald eagle: 

i3: insert into BELIEF 'Bob' not Sightings
    values ('s1','Carol','fish eagle','6-14-08','Lake Forest')

This ensures that Bob still disagrees even if Carol's tuple
is updated to species='fish eagle'. In both cases, Bob
uses the external key 's1' to refer to the tuple with which
he disagrees.

Alice, a field technician, believes there was a crow at 
Lake Placid because she found some black feathers. She 
does not insert a regular tuple as Carol did, but inserts 
only her own belief: 

i4: insert into BELIEF 'Alice' Sightings
    values ('s2','Alice','crow','6-14-08','Lake Placid')
i5: insert into BELIEF 'Alice' Comments
    values ('c1','found feathers','s2')

Bob believes there cannot be any crows in the Lake
Placid area. He wants to annotate the data with the fol- 
lowing belief statements: (i) Bob believes that Alice saw 
a raven, not a crow; (ii) Bob believes that Alice believed 
that the feathers she found were black; and (iii) Bob be- 
lieves the feathers were actually purple-black, suggest- 
ing they come from a raven, not a crow. The second and 
third belief statements above are Bob's suggestion why 
Alice may have made a mistake. These annotations are 
inserted into the BDMS as follows: 

i6: insert into BELIEF 'Bob' Sightings
    values ('s2','Alice','raven','6-14-08','Lake Placid')
i7: insert into BELIEF 'Bob' BELIEF 'Alice' Comments
    values ('c2','black feathers','s2')
i8: insert into BELIEF 'Bob' Comments
    values ('c2','purple-black feathers','s2')

Notice here the important role of the higher-order be- 
lief statement: "Bob believes that Alice believes that 
the feathers were black"; this is how Bob explains his 
disagreement with Alice. Such explanations are quite 
common in a collaborative data curation process, and 
it is important for a BDMS to support them. 

At this point we have recorded eight belief statements 
in the database. In the following section, we adopt the 
formalism of multi-modal logic [25] and write $\Box_u t^+$ for
the assertion "user u believes tuple t". Figure 2 illustrates
this with our eight statements. Note that in practice,
a BDMS needs to keep additional information in its internal
schema, which we describe in Sect. 5.
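The correspondence between a BeliefSQL insert header and the modal notation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `parse_insert_header` and the regular expression are hypothetical and not part of the BDMS prototype, and the sketch covers only the `insert into` header of Fig. 1, not the `values` clause.

```python
import re

# Header grammar from Fig. 1: insert into ((BELIEF user)+ not?)? relationname
# The regular expression below is an assumed, simplified reading of that grammar.
HEADER = re.compile(
    r"^insert\s+into\s+"
    r"(?:((?:BELIEF\s+'[^']+'\s+)+)(not\s+)?)?"
    r"(\w+)\s*$"
)


def parse_insert_header(header):
    """Return (belief_path, sign, relation) for a BeliefSQL insert header."""
    match = HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"not a BeliefSQL insert header: {header!r}")
    beliefs, negation, relation = match.groups()
    # Each "BELIEF 'user'" prefix contributes one user to the belief path.
    path = tuple(re.findall(r"BELIEF\s+'([^']+)'", beliefs or ""))
    # "not" is only accepted after at least one BELIEF prefix.
    sign = "-" if negation else "+"
    return path, sign, relation
```

For instance, the header of i7 maps to the belief path (Bob, Alice) with a positive sign on Comments, which is the statement "Bob believes that Alice believes" from the text.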

Finally, we illustrate two queries over the belief database.
The first query asks for sightings at Lake Placid
believed by Bob. It returns ('s2', 'Alice', 'raven'):

q1: select S.sid, S.uid, S.species
    from Users as U, BELIEF U.uid Sightings as S
    where U.name = 'Bob'
    and S.location = 'Lake Placid'

The second query retrieves entries on which users dis- 
agree with what Alice believes: 

q2: select U2.name, S1.species, S2.species
    from Users as U1, Users as U2,
         BELIEF U1.uid Sightings as S1,
         BELIEF U2.uid Sightings as S2
    where U1.name = 'Alice'
    and S1.sid = S2.sid
    and S1.species <> S2.species

The BDMS returns ('Bob', 'crow', 'raven'), implying that 
Bob disagrees with Alice's crow sighting. 

3. FORMAL SETUP 

We introduce here the basic notion of a belief database, 
which enriches a standard database with annotations of 
users' beliefs. Informally, a belief database represents 
a set of incomplete and consistent database instances. 
Depending on which tuples they share or do not share, 
any two such instances can be mutually disjoint, over- 
lapping, contained or partly conflicting. 

Figure 2: Our running example. Left: Ground
tuples inserted and annotated by different users.
Conflicting tuples (like the crow and raven tuples)
share the same external key. Internal keys
(like $s1_1$ and $s1_2$) uniquely identify each tuple.
Right: Belief annotations over the ground tuples
written in the notation of multi-modal logic.

Standard relational background. We fix a relational
schema $\mathcal{R} = (R_1, \ldots, R_r)$ and assume that each
relation $R_i(att_{i1}, \ldots, att_{il_i})$ with arity $l_i$ has a
distinguished primary key $att_{i1}$, for which we alternatively
write $key_i$ to make the key attribute explicit. In the
context of belief databases, we call $\mathcal{R}$ the external schema,
as this is how users see the non-annotated data, and denote
by $I$ a conventional database instance without annotations.
An incomplete database is a set of conventional
database instances $\{I_1, I_2, \ldots\}$ over a fixed schema $\mathcal{R}$ [32] [28].
For each relation $R_i$, denote by $Tup_i$ the set of typed
atomic tuples of the form $R_i(a_1, \ldots, a_{l_i})$. Further denote
by $Tup = \bigcup_i Tup_i$ the domain of all tuples or the
tuple universe of the schema. We further require that
$Tup_i \cap Tup_j = \emptyset$ where $i \neq j$, i.e., each tuple $t \in Tup$
is uniquely associated with one relation of the schema.
If $t \in Tup$ then $key(t)$ represents the typed value of the
key attribute in $t$. Using this notation, consistency and
conventional key constraints are defined as follows:

Definition 1 (consistency). A database instance
$I$ over a relation R is consistent iff it satisfies the key
constraints $\Gamma(I)$, i.e. no two tuples from the same relation
share the same key:

$$\Gamma(I) \equiv \forall i.\, \forall t, t' \in Tup_i.\, (t, t' \in I \wedge t \neq t' \Rightarrow key(t) \neq key(t'))$$

3.1 Belief worlds

A belief world is a set of positive and negative beliefs 
of a user about the database content or other users'
beliefs, and represents a set of consistent database in- 
stances. For example, one belief world is what Alice be- 
lieves, another one is what Bob believes Alice believes. 

Negative beliefs arise naturally when users disagree 
about a ground fact or belief but do not have an al- 
ternative suggestion. In order to allow for such ex- 
plicit negative database entries, the default has to con- 
sider a tuple possible before it is inserted as either pos- 
itive or negative. This default corresponds to the Open 
World Assumption (OWA), and differs from conven- 
tional databases where every tuple that is not in the 
database is considered negated according to the Closed 
World Assumption (CWA) [40].

We next give a precise definition and semantics to a 
belief world based on incomplete databases: 
Definition 2 (belief world). A belief world is a
pair $W = (I^+, I^-)$ where both $I^+$ and $I^-$ are conventional
database instances over the schema $\mathcal{R}$ that are, a
priori, not required to satisfy the key constraints.

Definition 3 (semantics of a belief world).
The semantics of a belief world $W = (I^+, I^-)$ is the incomplete
database of instances $I$ over the schema $\mathcal{R}$ that
contain all tuples from $I^+$, contain no tuples from $I^-$,
and that satisfy the key constraints:

$$[\![W]\!] = \{I \mid I^+ \subseteq I,\ I \cap I^- = \emptyset,\ \Gamma(I)\}$$

Definition 4 (consistency of a belief world).
A belief world $W$ is consistent iff $[\![W]\!] \neq \emptyset$.

Proposition 5 (consistency of a belief world).
A belief world $W$ is consistent iff it satisfies the following
two constraints:

$$\Gamma_1(W) \equiv \Gamma(I^+)$$
$$\Gamma_2(W) \equiv \forall t \in I^+.\ t \notin I^-$$

The above definitions and proposition state that a
belief world is represented by two different database instances.
Those have to fulfill two constraints in order
to represent a consistent set of beliefs: $\Gamma_1$ is a standard
key constraint on $I^+$, and $\Gamma_2$ requires that $I^+ \cap I^- = \emptyset$.

It is convenient to represent a belief world by combining
the two instances $I^+$ and $I^-$ into a single table
where each tuple has an additional sign attribute $s$
whose value is '+' for the tuples in $I^+$ and '-' for those
in $I^-$. Figure 3 illustrates this with the belief world
"Bob believes" from our running example. His version
of Sightings has one positive and two negative records.
For example, Bob believes that Alice saw a raven (tuple
with sid = 's2'), but he does not believe that Carol
saw a 'bald eagle' nor a 'fish eagle' (both tuples share
sid = 's1', hence refer to the same sighting). This example
illustrates why $I^-$ does not have to satisfy the key
constraints: we want to allow a user to disagree with
more than one alternative. This is needed, for example,
if Alice adds a belief statement i9 with the species 'fish
eagle' as alternative explanation of Carol's entry i1:

i1: ('s1','Carol','bald eagle','6-14-08','Lake Forest')$^+$
i9: $\Box_{\mathrm{Alice}}$ ('s1','Carol','fish eagle','6-14-08','Lake Forest')$^+$

Here, i1 and i9 represent conflicting positive statements.
But, in addition, Bob disagrees with both.

We now define positive and negative beliefs formally. 
Note that they correspond exactly to the concepts of 
certain and impossible tuples in incomplete databases. 

Definition 6 (positive and negative beliefs).
Let $W$ be a belief world. We say that a tuple $t$ is a positive
belief for $W$ iff $t$ belongs to all instances in $[\![W]\!]$
and write $W \models t^+$. We say that a tuple $t$ is a negative
belief for $W$ iff $t$ belongs to no instance in $[\![W]\!]$ and
write $W \models t^-$:

$$W \models t^+ \text{ iff } \forall I \in [\![W]\!] : t \in I$$
$$W \models t^- \text{ iff } \forall I \in [\![W]\!] : t \notin I$$

Figure 3: Belief world "BELIEF Bob" or $\Box_{\mathrm{Bob}}$ of
our running example.

Proposition 7 (positive and negative beliefs).
Let $t$ be a tuple in $Tup_i$, i.e. it is a typed tuple for relation
$R_i$. Tuple $t$ is a positive belief for $W$ iff it is in
$I^+$. It is a negative belief for $W$ iff it is either in $I^-$
("stated negative") or if there is another tuple $t' \in I^+$
from $Tup_i$ with the same key ("unstated negative"):

$$W \models t^+ \text{ iff } t \in I^+$$
$$W \models t^- \text{ iff } t \in I^- \vee \exists t' \in Tup_i.\,(t' \in I^+ \wedge t' \neq t \wedge key(t') = key(t))$$
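The characterizations in Propositions 5 and 7 translate almost directly into set operations. The following Python sketch is an illustration only: the encoding of a tuple as `(relation, key, *attributes)` and the function names are assumptions made here, and the negative-belief test is meaningful only for a consistent belief world, as in the proposition.

```python
def key(t):
    """External key of a tuple encoded as (relation, key, *attributes)."""
    return (t[0], t[1])


def is_consistent(pos, neg):
    """Proposition 5: key constraint on I+ and no tuple in both I+ and I-."""
    keys = [key(t) for t in pos]
    return len(keys) == len(set(keys)) and not (pos & neg)


def is_positive_belief(pos, neg, t):
    """Proposition 7: t is a positive belief iff it is in I+."""
    return t in pos


def is_negative_belief(pos, neg, t):
    """Proposition 7: stated negative, or unstated negative via a key conflict."""
    return t in neg or any(key(u) == key(t) and u != t for u in pos)


# Bob's version of Sightings from the running example (Fig. 3).
BALD = ("Sightings", "s1", "Carol", "bald eagle", "6-14-08", "Lake Forest")
FISH = ("Sightings", "s1", "Carol", "fish eagle", "6-14-08", "Lake Forest")
CROW = ("Sightings", "s2", "Alice", "crow", "6-14-08", "Lake Placid")
RAVEN = ("Sightings", "s2", "Alice", "raven", "6-14-08", "Lake Placid")

BOB_POS = {RAVEN}
BOB_NEG = {BALD, FISH}
```

With this encoding, the crow tuple is an unstated negative belief for Bob: he never rejected it explicitly, but his positive raven tuple shares the key 's2'.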

3.2 Belief Databases 

A belief database is a collection of belief worlds, one
for each possible combination of what users believe about
the database content or other users' beliefs. We use the
notation of multi-modal logic [25] to express belief statements.
For example, the following statement denotes
"Bob believes that Alice believes that tuple t is false":

$$\Box_{\mathrm{Bob}} \Box_{\mathrm{Alice}} \neg t \qquad (1)$$

Let $U$ be a set of users. In practice, $U$ is a set of
user IDs, but we simply take $U = \{1, \ldots, m\}$. A belief
path is $w \in U^*$, denoted as $w = w_{[1]} \cdots w_{[d]}$. We further
restrict belief paths to be $w \in \hat{U}^*$ with
$\hat{U}^* = \{w \in U^* \mid w_{[i]} \neq w_{[i+1]}\}$, i.e. belief paths do not
contain the same user ids in successive positions. We define a
subpath as $w_{[i,j]} = w_{[i]} \cdots w_{[j]}$ (defined to be $\varepsilon$ when $i > j$),
a suffix as a subpath $w_{[i,d]}$, where $d$ is the depth
or belief path length of $w$ ($d = |w|$), and we define as
usual the concatenation of two sequences $v \cdot w$. We use
$\Box_w$, $\Box_{w_{[1,d]}}$, and $\Box_{w_{[1]}} \cdots \Box_{w_{[d]}}$ as equivalent notations.
Hence, expression (1) is equal to $\Box_{\mathrm{Bob} \cdot \mathrm{Alice}}\, t^-$.



Definition 8 (Belief Database). (1) A belief
statement $\varphi$ is an expression of the form $\Box_w t^s$ where
$w \in \hat{U}^*$ is a belief path, $t$ is a ground tuple from the
tuple universe, and $s \in \{'+', '-'\}$ is a sign.

(2) A belief database $D$ is a set of belief statements.

(3) Given a belief database $D$ and a belief path $w$, the
explicit belief world at $w$ is $D_w = (I_w^+, I_w^-)$ with:

$$I_w^+ = \{t \mid \Box_w t^+ \in D\}$$
$$I_w^- = \{t \mid \Box_w t^- \in D\}$$

(4) A belief database $D$ is consistent iff $D_w$ is consistent
for all $w \in \hat{U}^*$.

Figure 2 illustrates the belief database from our running
example with eight belief statements. The explicit
belief worlds for "Bob believes" and for "Bob believes
that Alice believes" are:

$$D_{\mathrm{Bob}} = (\{s2_2, c2_2\}, \{s1_1, s1_2\})$$
$$D_{\mathrm{Bob} \cdot \mathrm{Alice}} = (\{c2_1\}, \emptyset)$$
Continuing this example, let's examine what happens
if a new user Dora joins the discussion. Initially there
are no belief statements for Dora. In this case, the system
needs to assume by default that Dora believes everything
that is stated explicitly in the database. If we
didn't do so, then we would force Dora to insert explicitly
all tuples she agrees with, which are arguably the
majority of the tuples in the database. Thus, by default,
we assume that a user believes every belief statement
that is in the database, unless stated otherwise.
Dora may later update her belief and disagree explicitly
with some tuples; the default rule only applies to tuples
about which Dora has not expressed explicit disagreement
by inserting either a negative belief or a positive
belief with the same key but different attributes. For
example, when user 1 inserts a belief statement $\Box_1 t$,
user 2 will believe by default that user 1 believes what
he states, i.e. $\Box_{2 \cdot 1} t$, but not necessarily the fact itself,
i.e. $\Box_2 t$. We call this default rule the message board
assumption in analogy to discussion boards where users
state and exchange their opinions about facts and each
other's beliefs. We define this formally next.

Definition 9 (Implicit Beliefs). Given a belief
database $D$, define the following sequence $D^{(d)}$:

$$D^{(0)} = D$$
$$D^{(d+1)} = D^{(d)} \cup \{\Box_i \varphi \mid \varphi \in D^{(d)},\ i \in U,\ D^{(d)} \cup \{\Box_i \varphi\} \text{ is consistent}\}$$

Definition 10 (Theory). The closure of $D$ is
$\bar{D} = \bigcup_{d \geq 0} D^{(d)}$. We call the set $\bar{D}$ the theory of $D$.

The (infinite) belief database $\bar{D}$ captures our intended
semantics: it contains all belief statements explicitly asserted
in $D$ together with all statements that follow implicitly,
except if they were explicitly contradicted.

Lemma 11. If $D$ is consistent, then $\bar{D}$ is consistent.
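The first few elements of the sequence in Def. 9 can be computed by brute force, which may help to see how the message board assumption behaves on the running example. The Python sketch below is an assumption-laden illustration, not the evaluation strategy of the paper: the encoding of a belief statement as `(path, tuple, sign)`, the names `closure` and `world_is_consistent`, and the cut-off after a fixed number of stages are all choices made here (the theory itself is infinite).

```python
from itertools import product

USERS = ("Alice", "Bob", "Carol")

# Tuples are encoded as (relation, key, *attributes); this encoding is an assumption.
BALD = ("Sightings", "s1", "Carol", "bald eagle", "6-14-08", "Lake Forest")
FISH = ("Sightings", "s1", "Carol", "fish eagle", "6-14-08", "Lake Forest")
CROW = ("Sightings", "s2", "Alice", "crow", "6-14-08", "Lake Placid")
RAVEN = ("Sightings", "s2", "Alice", "raven", "6-14-08", "Lake Placid")
C1 = ("Comments", "c1", "found feathers", "s2")
C2_BLACK = ("Comments", "c2", "black feathers", "s2")
C2_PURPLE = ("Comments", "c2", "purple-black feathers", "s2")

# The eight belief statements i1..i8 as (belief path, tuple, sign).
D = frozenset({
    ((), BALD, "+"),                     # i1
    (("Bob",), BALD, "-"),               # i2
    (("Bob",), FISH, "-"),               # i3
    (("Alice",), CROW, "+"),             # i4
    (("Alice",), C1, "+"),               # i5
    (("Bob",), RAVEN, "+"),              # i6
    (("Bob", "Alice"), C2_BLACK, "+"),   # i7
    (("Bob",), C2_PURPLE, "+"),          # i8
})


def world_is_consistent(db, path):
    """Check the two constraints of Proposition 5 for the world at `path`."""
    pos = {t for p, t, s in db if p == path and s == "+"}
    neg = {t for p, t, s in db if p == path and s == "-"}
    keys = [(t[0], t[1]) for t in pos]
    return len(keys) == len(set(keys)) and not (pos & neg)


def closure(db, users, stages):
    """Compute D^(stages) following Def. 9, one stage per loop iteration."""
    current = set(db)
    for _ in range(stages):
        added = set()
        for (path, tup, sign), user in product(current, users):
            if path and path[0] == user:
                continue  # belief paths never repeat a user in successive positions
            candidate = ((user,) + path, tup, sign)
            # Only the world of the candidate can become inconsistent.
            if candidate not in current and world_is_consistent(
                current | {candidate}, candidate[0]
            ):
                added.add(candidate)
        current |= added
    return current


theory_prefix = closure(D, USERS, stages=2)
```

After two stages the result already contains the statement that Alice believes the bald eagle sighting and that Bob believes that Alice believes it, while Bob's own positive belief in that sighting is blocked by his explicit disagreement i2.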

We give now the formal semantics of a belief database,
by defining the entailment relationship $D \models \varphi$.

Definition 12 (Semantics of a Belief Database).
A belief database $D$ entails a belief statement $\varphi$, in notation
$D \models \varphi$, if $\varphi \in \bar{D}$.

We illustrate with our running example (Fig. 2). After
Carol inserted her statement (i1: $s1_1^+$), Alice and Bob
believe the bald eagle sighting by default ($D \models \Box_{\mathrm{Alice}}\, s1_1^+$).
Bob, however, does not want to believe this sighting and
explicitly states his disagreement (i2: $\Box_{\mathrm{Bob}}\, s1_1^-$). While
he does not believe it himself, he still believes that Alice
believes this sighting ($D \models \Box_{\mathrm{Bob} \cdot \mathrm{Alice}}\, s1_1^+$).

3.3 Queries over Belief Databases

We now introduce our language for querying belief
databases which consists of conjunctive queries extended
with belief annotations. We call these Belief Conjunctive
Queries (BCQ) and adopt a compact, Datalog-like
syntax that combines elements from multi-modal logic.

Definition 13 (BCQ syntax). A belief conjunctive
query is an expression of the form

$$q(\bar{x}) \text{ :- } \Box_{\bar{w}_1} R_1^{s_1}(\bar{x}_1), \ldots, \Box_{\bar{w}_g} R_g^{s_g}(\bar{x}_g)$$

consisting of a query head $q(\bar{x})$ and $g$ belief atoms or
modal subgoals forming the query body. Each modal
subgoal $\Box_{\bar{w}_i} R_i^{s_i}(\bar{x}_i)$ comprises a belief path $\bar{w}_i$, a sign $s_i$,
and a relational atom $R_i(\bar{x}_i)$ with relational tuples $\bar{x}_i$.

We call a modal subgoal $\Box_{\bar{w}} R^s(\bar{x})$ positive if $s = '+'$,
and negative if $s = '-'$. We write $\bar{x}$ and $\bar{w}$ for tuples
and belief paths. They can contain both variables and
constants. We write $var(\bar{w})$ and $var(\bar{x})$ to denote the
variables of $\bar{w}$ and $\bar{x}$. We also allow arithmetic predicates
in the query body, using standard operators $\neq$, $<$, $>$,
$\leq$, and $\geq$. A variable occurrence in a belief path or a
positive relational atom is called a positive occurrence.
A query is safe if every variable has at least one positive
occurrence. We assume all queries to be safe.

We define next the semantics of a query. We write below
$D \models \varphi_1, \ldots, \varphi_g$ for $\bigwedge_i (D \models \varphi_i)$, where $\varphi_1, \ldots, \varphi_g$
are belief statements.

Definition 14 (BCQ semantics). Let $q$ be a query
with head variables $\bar{x}$ and body $\Phi$. The answer
to $q$ on a belief database $D$ is the following set of tuples
over the set of constants in the attribute domains:

$$\{\theta(\bar{x}) \mid \theta : var(\Phi) \to const,\ D \models \theta(\Phi)\}$$

In other words, for every valuation $\theta$ that maps variables
to constants in the attribute domains, consider the formula
$\theta(\Phi)$, which is of the form $\varphi_1, \ldots, \varphi_g$ (one belief
statement for each subgoal): if $D$ entails $\theta(\Phi)$, then we
return the tuple $\theta(\bar{x})$. Recall that a belief world can entail
positive and negative beliefs (Def. 6). Depending on
its sign $s$ and its belief path $\bar{w}$, each subgoal represents
positive or negative beliefs of one or more belief worlds.
A BCQ then asks for constants in relational tuples and
belief paths that imply positive beliefs in positive subgoals,
and negative beliefs in negative subgoals.

Example 15. Using $S$ for the relation Sightings, the
following query returns all users $x$ who disagree with any
of Alice's beliefs, i.e. who have a negative belief about
some tuple $(y, z, u, v, w)$, which is a positive belief for
Alice at the same time.

$$q_3(x) \text{ :- } \Box_x S^-(y, z, u, v, w),\ \Box_{\mathrm{Alice}} S^+(y, z, u, v, w)$$
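To make the semantics of Example 15 concrete, the query can be evaluated by hand over the depth-one belief worlds of the running example. The sketch below is hedged in two ways: the world contents for Alice, Bob, and Carol are derived here from the message board assumption rather than copied from the paper, and the tuple encoding and function names are assumptions of this illustration. Only the Sightings relation is modeled.

```python
BALD = ("Sightings", "s1", "Carol", "bald eagle", "6-14-08", "Lake Forest")
FISH = ("Sightings", "s1", "Carol", "fish eagle", "6-14-08", "Lake Forest")
CROW = ("Sightings", "s2", "Alice", "crow", "6-14-08", "Lake Placid")
RAVEN = ("Sightings", "s2", "Alice", "raven", "6-14-08", "Lake Placid")

# Belief worlds (I+, I-) per user, restricted to Sightings.
# Assumed contents: explicit statements plus defaults inherited from the ground tuples.
WORLDS = {
    "Alice": ({BALD, CROW}, set()),
    "Bob": ({RAVEN}, {BALD, FISH}),
    "Carol": ({BALD}, set()),
}


def key(t):
    return (t[0], t[1])


def is_negative(world, t):
    """Negative belief per Proposition 7 (stated or unstated)."""
    pos, neg = world
    return t in neg or any(key(u) == key(t) and u != t for u in pos)


def q3(worlds):
    """Users x with a negative belief about some tuple Alice believes positively."""
    alice_pos, _ = worlds["Alice"]
    # The second subgoal binds the tuple variables to Alice's positive beliefs,
    # so it suffices to enumerate those tuples for each candidate user x.
    return {x for x, world in worlds.items() for t in alice_pos if is_negative(world, t)}


answer = q3(WORLDS)
```

Under these assumed worlds the answer contains only Bob: he rejects the bald eagle tuple explicitly and the crow tuple implicitly through his raven tuple with the same key.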

3.4 Discussion 

Default rules like our message board assumption are 
studied in default logics. In our presentation, we avoided 
introducing default logics, non-monotonic reasoning, and 
stable model semantics, and opted for a simpler defini- 
tion. Yet, an alternative formulation of our message 
board assumption can be given using Reiter's default
logic [IT]: The set of formulas $\bar{D}$ that we define in Def. 9
is provably equal to the provably unique stable model
for $D$ under the default rule (see appendix C):

$$\frac{\varphi : \Box_i \varphi}{\Box_i \varphi}$$
Designing an appropriate data and query model for
belief databases requires a fine tradeoff between tractability
and expressiveness. Reasoning in modal logics can
quickly become intractable [26]. This applies, in particular,
to fragments that include possibility in addition to
certainty and impossibility (positive or negative beliefs).
In the notation of modal logics, we allow statements of
the form $\Box_{\mathrm{Alice}}\, t$ and $\Box_{\mathrm{Alice}} \neg t$ (Alice believes that t is
necessary or impossible). Complexity would increase
considerably by allowing negations before the modal
operators, e.g. $\neg\Box_{\mathrm{Alice}}\, t$ (Alice does not believe that t
is necessary), which is equal to $\Diamond_{\mathrm{Alice}} \neg t$ (Alice believes
that $\neg t$ is possible). In our fragment of modal logics, we
allow negations only on ground facts, noting that this
is sufficient to express conflicts.

The general approach for defining semantics in modal 
logic is through axioms and Kripke structures [21] [25].
Every concrete logic consists of a class of axioms and 
considers formulas that are logical consequences from 
these axioms, where entailment is defined in terms of 
Kripke structures. Often, axioms can be removed by 
restricting the class of Kripke structures. For example, 
the axioms in K5 are equivalent to restricting Kripke
structures to have accessibility relations that are
Euclidean. We have chosen to define the semantics
of a belief database without the aid of axioms
and Kripke structures, because we felt it is simpler for 
our setting. On the other hand, our definition does not 
lead to an obvious query evaluation procedure. To de- 
rive such a procedure we introduce a particular Kripke 
structure next, and show that it defines a semantics that 
is equivalent to that in Def. 12.




4. CANONICAL KRIPKE STRUCTURE

We review here Kripke structures [25], then define our
canonical Kripke structure that captures precisely the
semantics of belief databases (Def. 12).

A rooted Kripke structure is $K = (V, (W_v)_{v \in V}, (E_i)_{i \in U}, v_0)$
where:

• $V$ is a finite set called states,

• $W_v = (I_v^+, I_v^-)$ is a belief world associated with
each state $v \in V$,

• $E_i \subseteq V \times V$ is a set of edges or accessibility relations
associated with each user $i \in U$,

• $v_0 \in V$ is the root of the Kripke structure.

Given a rooted Kripke structure $K$ and a state $v$, the
entailment relationship $(K, v) \models \varphi$ is defined recursively:

$(K, v) \models t^+$ if $W_v \models t^+$ (Def. 6)
$(K, v) \models t^-$ if $W_v \models t^-$ (Def. 6)
$(K, v) \models \Box_i \varphi$ if $\forall (v, v') \in E_i.\ (K, v') \models \varphi$

We write $K \models \varphi$ if $(K, v_0) \models \varphi$.

We illustrate with the Kripke structure of Fig. 4.
There are four states #0, ..., #3 with the root #0.
Consider the belief world at state #2, $W_{\#2} = (I_{\#2}^+, I_{\#2}^-)$.
$I_{\#2}^+$ consists of the tuples $s2_2$, $c2_2$ and $I_{\#2}^-$ of the tuples
$s1_1$, $s1_2$. We therefore have $(K, \#2) \models s2_2^+$. As all edges
labeled 2 from the root lead to the state #2, we further
have $K \models \Box_2\, s2_2^+$. In the following, we use interchangeably
the notions of world id (e.g. #3) or belief path (e.g.
$w = 2 \cdot 1$), and those of state or world.

Figure 4: The canonical Kripke structure for our
running example.

Consider a belief database $D$. We define the support
states as the set of all belief paths $w$ for which $D$ contains
a belief statement over $w$, and the states as the
set of all their prefixes:

$$Supp(D) = \{w \in \hat{U}^* \mid D_w \neq (\emptyset, \emptyset)\}$$
$$States(D) = \{w \in \hat{U}^* \mid \exists u \in \hat{U}^* : w \cdot u \in Supp(D)\}$$

For any $w \in \hat{U}^*$ we define the suffix states as all the
suffixes of $w$ that are in $States(D)$, and the deepest suffix
state (dss) as the suffix state with the longest belief path:

$$SuffixStates(w) = \{v \in States(D) \mid \exists u \in \hat{U}^* : u \cdot v = w\}$$
$$dss(w) = \arg\max_v \{|v| \mid v \in SuffixStates(w)\}$$

We can now define formally the canonical Kripke structure
for a belief database $D$:



Definition 16 (Canonical Kripke Structure).
Let $D$ be a belief database, and denote $V = States(D)$.
The canonical Kripke structure is $K(D) = (V, (\bar{D}_v)_{v \in V}, (E_i)_{i \in U}, \varepsilon)$,
with edges defined as:

$$E_i = \{(w, dss(w \cdot i)) \mid w \in States(D),\ w \cdot i \in \hat{U}^*\}$$

We describe informally the canonical Kripke structure
for $D$. Start with all the belief paths $w$ that are mentioned
in some belief statement in $D$: these form the
support states. Take all their prefixes: these form all
states of $K(D)$. Next, for each state $v$, compute the belief
world $\bar{D}_v$: this is the belief world for $v$ in the closure
of $D$. Although the closure $\bar{D}$ is an infinite object, the
set $\bar{D}_v$ is contained in $D^{(d)}$ where $d = |v|$. Thus, in order
to compute $\bar{D}_v$ it suffices to compute $D^{(d)}$ through
a finite process, then take $\bar{D}_v = D^{(d)}_v$. Finally, edges
labelled $i$ in $K(D)$ go "forward" from a state $w$ to state
$w \cdot i$ if the latter exists. Otherwise they go "back" to
the state with the longest belief path that is a suffix of
the desired, but missing state $w \cdot i$. That means, edges
labelled $i$ always go from a state $w$ to $dss(w \cdot i)$.
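The states and edges of the canonical Kripke structure can be computed with a few lines of code once the support states are known. The following Python sketch is an illustration under assumptions: belief paths are encoded as tuples of user names, the support set is written out by hand for the running example, and the function names `states`, `dss`, and `edges` are chosen here to mirror the definitions. The belief worlds attached to the states are not computed.

```python
USERS = ("Alice", "Bob", "Carol")

# Belief paths that carry at least one explicit statement in the running example.
SUPPORT = {(), ("Alice",), ("Bob",), ("Bob", "Alice")}


def states(support):
    """All prefixes of the support states (including the empty path)."""
    return {w[:k] for w in support for k in range(len(w) + 1)}


def dss(w, all_states):
    """Deepest suffix state: the longest suffix of w that is a state."""
    for start in range(len(w) + 1):  # longest suffix first
        if w[start:] in all_states:
            return w[start:]
    raise ValueError("the empty path must be a state")


def edges(all_states, users):
    """Map (state, user) to the target state dss(state . user)."""
    return {
        (w, i): dss(w + (i,), all_states)
        for w in all_states
        for i in users
        if not w or w[-1] != i  # w . i must not repeat a user in succession
    }


STATES = states(SUPPORT)
EDGES = edges(STATES, USERS)
```

In this encoding the edge labelled Alice from Bob's state goes "forward" to the state (Bob, Alice), whereas the edge labelled Bob from Alice's state goes "back" to Bob's state, because (Alice, Bob) is not a state.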

We prove the following theorem in the appendix: 

Theorem 17 (Canonical Kripke Structure).

(1) For any belief statement $\varphi$, $D \models \varphi$ iff $K(D) \models \varphi$.

(2) $K(D)$ can be computed in time $O(m^d n)$, where $n$ is the size of the belief database $D$, $m$ the number of users and $d$ is the maximum depth of any belief path in $D$.
Figure 5: Relational representation of the canonical Kripke structure for our running example.



Note that K(D) encodes an infinite number of belief 
worlds with a finite number of states. This provides the 
basis for our query evaluation approach: given a be- 
lief database D, compute its canonical Kripke structure 
K(D), then evaluate queries over K(D). We address 
the latter step in the next section. 

5. TRANSLATION 

This section covers the representation of belief data- 
bases in the standard relational model. In particular, we 
give (1) the representation of the canonical Kripke struc- 
ture, (2) the translation of belief conjunctive queries 
over this representation, and (3) updates of a database. 

5.1 The relational representation 

The relational representation uses an internal schema $\mathcal{R}^* = (R_1^*, \ldots, R_r^*, U, V_1, \ldots, V_r, E, D, S)$. Recall that the first attribute of each content relation $R_i(att_i)$ contains the external key attribute $key_i$ of that relation. Each relation $R_i$ is represented by an internal relation $R_i^*$ with one additional attribute $tid$ and the relation obeying the functional dependency $tid \rightarrow Attr(R_i)$. The internal key constraint is only on this surrogate key: $R_i^*(tid, key_i, att_{i2}, \ldots, att_{il_i})$. In addition, the internal schema includes: a user relation $U$ with user ids and optional user attributes; $r$ valuation relations $V_i$, one for each $R_i$, recording tuples, their keys, signs and whether
they are explicit or implicit in each belief world (ex- 
plicit means explicitly annotated in contrast to implic- 
itly inferred by the default assumption); an edge relation 
E containing the accessibility relations between worlds 
for each user; a depth relation D recording the nesting 
depth of each world id; and a suffix relation S recording 
the deepest suffix state for each world. Relations D and 
S record information that is used during updates of the 
database. 

The representation of the canonical Kripke structure is then straightforward: For each world $w \in States(D)$ we create a unique world identifier $wid(w)$ and insert it into relation $D$ together with its nesting depth:

$D(wid(w), |w|)$

Analogously, create one entry in relation $S$ that records the deepest suffix state for each world:

$S(wid(w), wid(dss(w_{[2,d]})))$

Each tuple $R_i(k, x_2, \ldots, x_{l_i})$ of any world is inserted as

$R_i^*(t, k, x_2, \ldots, x_{l_i})$,

where $t$ is its unique internal key. Note that $R_i^*$ gathers tuples from $R_i$ of all worlds. All worlds in which $t$ appears are recorded in the valuation relation $V_i$ as

$V_i(wid(w), t, k, s, e)$,

where $s$ is its sign ('+' if positive or '-' if negative), and $e$ is 'y' or 'n' depending on whether the tuple is explicit or not in the particular world. This attribute indirectly records the "provenance" for each tuple in a world (explicitly asserted or implicitly inferred by the message board assumption) and is needed during updates; it implicitly tracks the origin world of an implicit tuple and allows determining precedence in case of updates with inconsistent values. The external key $k$ is included in the valuation relations in order to detect conflicts between different belief worlds by merely inspecting the valuation relations and, thereby, to increase efficiency during updates. Finally, for each $(u, v) \in E_j$, insert an entry into relation $E$:

$E(u, j, v)$



Figure 5 shows the representation of our running ex-
ample. Attributes wid stand for world id, tid for tuple 
id, uid for user id, s for sign, e for explicitness, and d 
for nesting depth. 
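As a rough illustration of this internal schema, the following sketch creates the relations in an in-memory SQLite database and reads the positive content of a user's world. It is an assumption-laden toy, not the prototype described later: the table and column names (`R_star`, `wid1`, `wid2`, `species`), the single content relation, and the sample rows are all hypothetical.

```python
import sqlite3


def create_internal_schema(conn):
    # One content relation R_star plus U, V, E, D, S (column names are assumptions).
    conn.executescript("""
        CREATE TABLE U (uid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE R_star (tid INTEGER PRIMARY KEY, k TEXT, species TEXT);
        CREATE TABLE V (wid INTEGER, tid INTEGER, k TEXT, s TEXT, e TEXT,
                        PRIMARY KEY (wid, tid, s));
        CREATE TABLE E (wid1 INTEGER, uid INTEGER, wid2 INTEGER,
                        PRIMARY KEY (wid1, uid));
        CREATE TABLE D (wid INTEGER PRIMARY KEY, d INTEGER);
        CREATE TABLE S (wid1 INTEGER PRIMARY KEY, wid2 INTEGER);
    """)


def positive_content(conn, uid):
    """Positive tuples in the world reached from the root (wid 0) via user uid."""
    rows = conn.execute("""
        SELECT r.k, r.species
        FROM E AS ed
        JOIN V AS v ON v.wid = ed.wid2
        JOIN R_star AS r ON r.tid = v.tid
        WHERE ed.wid1 = 0 AND ed.uid = ? AND v.s = '+'
        ORDER BY r.k, r.species
    """, (uid,))
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
create_internal_schema(conn)
conn.executemany("INSERT INTO U VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
conn.executemany("INSERT INTO D VALUES (?, ?)", [(0, 0), (1, 1)])
conn.execute("INSERT INTO S VALUES (1, 0)")
# Alice (1) has her own world 1; Bob (2) has none, so his edges lead back to the root.
conn.executemany("INSERT INTO E VALUES (?, ?, ?)", [(0, 1, 1), (0, 2, 0), (1, 2, 0)])
conn.executemany("INSERT INTO R_star VALUES (?, ?, ?)",
                 [(10, "s1", "crow"), (11, "s1", "raven")])
# The root states s1 = crow; Alice explicitly believes s1 = raven instead.
conn.executemany("INSERT INTO V VALUES (?, ?, ?, ?, ?)",
                 [(0, 10, "s1", "+", "y"), (1, 11, "s1", "+", "y")])
```

Here Bob has stated nothing, so his edge from the root loops back to the root and he sees the root content, while Alice's explicit belief replaces the conflicting default in her world.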

5.2 Query translation 

We next describe the translation of any belief conjunctive query into non-recursive Datalog over the internal schema: The translation first creates one temporary table for each subgoal and then creates one query over these tables (Algorithm 1).



Recall that a BCQ consists of $g$ positive or negative modal subgoals and optional additional arithmetic atoms (Def. 13). Conceptually, each positive subgoal represents a subquery for positive beliefs, and each negative subgoal for negative beliefs. A belief conjunctive query then asks for constants in relational tuples and belief paths that imply positive beliefs in positive subgoals, and negative beliefs in negative subgoals. Also recall from Prop. 7 that a negative belief can be either stated negative, i.e. due to an explicitly stated negative belief $t^-$, or unstated negative, i.e. due to an explicitly stated positive belief $t'^+$, where tuple $t'$ has the same key as $t$. Both of these cases have to be considered during query translation, which makes the translation for negative subgoals more complex, requiring nested disjunctions with negation. Also note that a negative
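Algorithm 1: Translation of any BCQ over the canonical belief representation.

Input: BCQ $q(\bar{x})$ :- $\Box_{\bar{w}_1} R_1^{s_1}(\bar{x}_1), \ldots, \Box_{\bar{w}_g} R_g^{s_g}(\bar{x}_g)$

Output: Translated query $Q(\bar{x})$ over temporary tables

1 Check safety of query:
    $\forall \alpha \in var(q): \alpha \in \big(\bigcup_i var(\bar{w}_i)\big) \cup \big(\bigcup_{i.(s_i = '+')} var(\bar{x}_i)\big)$

2 For each subgoal $i$, create a temporary table $T_i$:
    $T_i(\bar{w}_i, \bar{x}, s)$ :- $E^*(0, \bar{w}_i, z), V(z, t, \_, s, \_), R^*(t, \bar{x})$

3 Compose the final query with one temporary table $T_i$ and one condition $C_i$ for each subgoal $i$ ...
    $q(\bar{x})$ :- $T_1(\bar{w}_{t1}, \bar{x}_{t1}, s_{t1}), \ldots, T_g(\bar{w}_{tg}, \bar{x}_{tg}, s_{tg}), C_1, \ldots, C_g$

4 ... where conditions for positive subgoals are:
    $C_i = \big(\bigwedge_{j:1..d_i} \bar{w}_{ti[j]} = \bar{w}_{i[j]}\big),\ s_{ti} = 1,\ \bigwedge_{j:1..l_i} \bar{x}_{ti[j]} = \bar{x}_{i[j]}$

5 ... and conditions for negative subgoals are:
    $C_i = \big(\bigwedge_{j:1..d_i} \bar{w}_{ti[j]} = \bar{w}_{i[j]}\big),\ \bar{x}_{ti[1]} = \bar{x}_{i[1]},$
    $\big((s_{ti} = 0,\ \bigwedge_{j:2..l_i} \bar{x}_{ti[j]} = \bar{x}_{i[j]}) \vee (s_{ti} = 1,\ \bigvee_{j:2..l_i} \bar{x}_{ti[j]} \neq \bar{x}_{i[j]})\big)$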



subgoal alone is unsafe, since a single positive tuple in a 
belief world implies negative beliefs for all tuples from 
the same tuple universe with the same key. 

The algorithm first verifies safety: each variable of the query has to appear in a belief path or the relational tuples of a positive subgoal (line 1). It then creates, for each subgoal $\Box_{\bar{w}_i} R_i^{s_i}(\bar{x}_i)$, a temporary table $T_i$ (line 2) with $E^*(y, \bar{w}, z)$ being a notational shortcut for

$E^*(y, \bar{w}, z) \stackrel{\text{def}}{=} E(y, w_{[1]}, z_1), \ldots, E(z_{d-1}, w_{[d]}, z)$,

with $z = y$ for $\bar{w} = \varepsilon$. This table has arity $l_i + d_i + 1$ and includes all stated tuples for all worlds with belief path $\bar{w}$. Recall that $\bar{w}$ can have both constants and variables, so that an intermediate table can encode the valuations for more than one belief world. Note that we cannot perform arbitrary selections and projections for negative subgoals at this point, even if $\bar{x}$ includes constants. Any positive tuple can lead to another tuple being impossible that may actually be required to be joined with another positive or negative subgoal.

The final query (line 3) then combines those tables as follows: For positive subgoals, it chooses positive stated tuples ($s = 1$), and chooses constants or joins to other subgoals (line 4). For negative subgoals (line 5), it distinguishes the case of stated impossible tuples, i.e. $s = 0$, and unstated impossible tuples, i.e. positive tuples with $s = 1$ that share the key with at least one other certain tuple in another positive subgoal. Arithmetic predicates are simply added as additional conditions to the translated query.

The following example illustrates this translation. 

Example 18. Assume a relation R(sample, category, 
origin) that classifies empirical samples into a number 
of categories and records their origin. Consider a query 
for disputed samples, i.e. samples x for which at least 
two users y and z disagree on its category or origin: 

$q(x, y, z)$ :- $\Box_y R^+(x, u, v),\ \Box_z R^-(x, u, v)$

The query written in BeliefSQL is: 



select R1.sample, U1.name, U2.name
from Users as U1, Users as U2
     BELIEF U1.uid R as R1,
     BELIEF U2.uid not R as R2,
where R1.sample = R2.sample
and R1.category = R2.category
and R1.origin = R2.origin

The translation over the canonical belief representation 
first creates two intermediate tables: 

$T_1(y, x, u, v, s)$ :- $E(0, y, z_1), V(z_1, t, x, s, \_), R^*(t, \_, u, v)$
$T_2(z, x, u, v, s)$ :- $E(0, z, z_1), V(z_1, t, x, s, \_), R^*(t, \_, u, v)$

The final query then combines those two tables:

$Q(x, y, z)$ :- $T_1(y, x, u, v, '+'),\ T_2(z, x, u_2, v_2, s_2),$
    $(s_2 = '-' \wedge u_2 = u \wedge v_2 = v) \vee (s_2 = '+' \wedge (u_2 \neq u \vee v_2 \neq v))$
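The translated query of Example 18 can be tried out on a toy instance. The sketch below is one possible SQL rendering over SQLite, assuming simplified tables `E(wid1, uid, wid2)`, `V(wid, tid, k, s, e)` and `R_star(tid, k, category, origin)` with the root at world id 0; these names, the helper `disputed_samples`, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE E (wid1 INTEGER, uid INTEGER, wid2 INTEGER);
    CREATE TABLE V (wid INTEGER, tid INTEGER, k TEXT, s TEXT, e TEXT);
    CREATE TABLE R_star (tid INTEGER PRIMARY KEY, k TEXT, category TEXT, origin TEXT);
""")
# Users 1, 2, 3 each have their own world (ids 1, 2, 3) reachable from the root 0.
conn.executemany("INSERT INTO E VALUES (?, ?, ?)", [(0, 1, 1), (0, 2, 2), (0, 3, 3)])
conn.executemany("INSERT INTO R_star VALUES (?, ?, ?, ?)", [
    (1, "x1", "cat_a", "o1"),
    (2, "x1", "cat_b", "o1"),
    (3, "x2", "cat_a", "o1"),
])
conn.executemany("INSERT INTO V VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "x1", "+", "y"),  # user 1: x1 is cat_a
    (2, 2, "x1", "+", "y"),  # user 2: x1 is cat_b (unstated negative for cat_a)
    (1, 3, "x2", "+", "y"),  # users 1 and 2 agree on x2 ...
    (2, 3, "x2", "+", "y"),
    (3, 3, "x2", "-", "y"),  # ... but user 3 states the negative belief
])


def disputed_samples(conn):
    """Pairs (x, y, z): y believes a tuple for sample x that z disbelieves."""
    return conn.execute("""
        WITH T(usr, x, cat, org, s) AS (
            SELECT ed.uid, v.k, r.category, r.origin, v.s
            FROM E AS ed
            JOIN V AS v ON v.wid = ed.wid2
            JOIN R_star AS r ON r.tid = v.tid
            WHERE ed.wid1 = 0
        )
        SELECT DISTINCT t1.x, t1.usr, t2.usr
        FROM T AS t1 JOIN T AS t2 ON t1.x = t2.x
        WHERE t1.s = '+'
          AND ((t2.s = '-' AND t2.cat = t1.cat AND t2.org = t1.org)
            OR (t2.s = '+' AND (t2.cat <> t1.cat OR t2.org <> t1.org)))
        ORDER BY t1.x, t1.usr, t2.usr
    """).fetchall()
```

Both temporary tables of the example have the same definition, so the sketch uses a single common table expression joined with itself; the two branches of the disjunction correspond to stated and unstated negative beliefs.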

5.3 Updates 

Updates on a belief database consist of several smaller, often conditional operations; those operations often incorporate the result of non-recursive queries extended with a max-operator over the existing data. As a compact notation for these updates, we write in the following $\Delta R$ and $\nabla R$ to refer to a set of tuples that are inserted into or deleted from $R$:

$R^{new} = (R^{old} - \nabla R) \cup \Delta R$

We again use the letter $T$ for temporary tables, and use expressions of the form $\exists(\_, \_, z) \in T$ as notational shortcut for $\exists x, y.(x, y, z) \in T$. In order to specifically refer to keys, we write $R(k, \bar{x}')$ for relational tuples, where $\bar{x}'$ refers to $\bar{x}_{[2,l]}$ in $R(k, x_2, \ldots, x_l)$.

Data inserts. Assume a desired insert $\Box_w R^s(k, \bar{x}')$, i.e. we want to insert a tuple $R(k, \bar{x}')$ with sign $s$ into world $w$. Such an insert first has to ensure that the world $w$ already exists before the tuple can be inserted. Algorithm 2 (idWorld) does so by verifying that the path $w$ from the root leads to a world at depth $d = |w|$. If not, it recursively verifies that its parent node exists (line 3). Note that, complexity-wise, this recursion can be unfolded as it happens a maximum of $d$ times. idWorld then creates a new world id and applies the necessary operations on the canonical model (lines 4-7). One such operation finds the deepest suffix state (dss) of a world (Algorithm 3). This procedure needs the max-operator. The back link to $dss(w)$ is stored in relation $S$ (line 8). After creating a new world, idWorld inserts all tuples from $dss(w)$ as implicit tuples (line 9).

Given the world id $y$ of $w$, insertTuple (Algorithm 4) first verifies if the tuple $(\_, k, \bar{x}')$ already exists in $R^*$; if not, it creates a new entry (line 1). It then inserts the tuple into world $w$ only if this update is consistent with existing explicit beliefs (line 5). If inserted, insertTuple also has to verify possible updates in all dependent worlds of $w$ (line 8). Dependent worlds are those for which $w$ is a suffix state. In order of increasing depth, it verifies for each dependent world $z$ (line 9) that an update has no explicit conflict in $z$ (line 12) and no conflict in the dss of $z$ (line 14). If there are no such conflicts, the tuple gets inserted and overwrites any existing implicit conflicting beliefs.
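The consistency test against explicit beliefs can be illustrated with a small helper. This is a sketch of the check as described in the text and in Prop. 5, not the paper's Datalog: the function name `can_insert_explicit` and the row layout `(tid, sign, explicit)` for the tuples of one world that share the key of the new tuple are assumptions.

```python
def can_insert_explicit(world_rows, t, s):
    """Return True if signed tuple (t, s) may be inserted as an explicit belief.

    world_rows: iterable of (tid, sign, explicit) for all tuples of the target
    world that share the key of t, with sign in {'+', '-'} and explicit in {'y', 'n'}.
    Implicit rows never block an insert; they would be overwritten.
    """
    explicit = {(tid, sign) for tid, sign, e in world_rows if e == 'y'}
    if s == '+':
        # Blocked by an explicit negative of t or by any explicit positive
        # tuple with the same key (at most one positive tuple per key).
        no_explicit_negative = (t, '-') not in explicit
        no_explicit_positive = not any(sign == '+' for _, sign in explicit)
        return no_explicit_negative and no_explicit_positive
    # A negative insert is only blocked by an explicit positive of the same tuple.
    return (t, '+') not in explicit
```

Note that a positive tuple that is already explicitly present also returns False here, which matches the behaviour described for repeated inserts.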
Algorithm 2: (idWorld) Returns the identifier $x$ of world $w$. Creates new world if it does not exist yet.

Input: World belief path $w$
Output: World id $x = wid(w)$

1 Define $d = |w|$; check that depth of $x$ is $d$:
    $T(x)$ :- $E^*(0, w, x), D(x, d)$
2 if $T$ is empty then
3     Get the parent id:
        $x' = idWorld(w_{[1,d-1]})$
4     Create a new id $x$ for $w$ and a new entry in $D$:
        $\Delta D(x, d)$
5     Redirect the $w_{[d]}$-edge from $x'$ to $x$:
        $\nabla E(x', w_{[d]}, \_),\ \Delta E(x', w_{[d]}, x)$
6     For all users $u$ except $w_{[d]}$, create a $u$-edge from $x$ to the deepest suffix state of $w \cdot u$:
        $\Delta E(x, u, dss(w \cdot u))$ :- $U(u, \ldots), u \neq w_{[d]}$
7     For all worlds $v \cdot w_{[1,d-1]}$ for which $w$ is the deepest suffix state for $v \cdot w$, update the $w_{[d]}$-edge:
        $\nabla E(y, w_{[d]}, \_)$ :- $E^*(v, w_{[1,d-1]}, y), D(y, r), r > d, E(y, w_{[d]}, z), D(z, p), p < d$
        $\Delta E(y, w_{[d]}, x)$ :- $E^*(v, w_{[1,d-1]}, y), D(y, r), r > d, E(y, w_{[d]}, z), D(z, p), p < d$
8     Create backlink to deepest suffix in $S$:
        $\Delta S(x, dss(w_{[2,d]}))$
9     Insert all implicit tuples into new world $w$:
        $\Delta V(x, t, y, s, 'n')$ :- $S(x, z), V(z, t, y, s, \_)$
10 return $x$



Algorithm 3: (dss) Returns the world id of the deepest suffix state for belief path $w$.

Input: World belief path $w$
Output: $z = wid(dss(w))$

1 Query ids $z$ and depths $y$ of all suffix worlds:
    for $p = 1 \ldots (d + 1)$ do
        $T(z, y)$ :- $E^*(0, w_{[p,d]}, z), D(z, y)$
2 Return the id $z$ of the world with maximum depth:
    return $z$ from $T(z, y)$ where $y = \max(y)$



Other updates. For a new user insert, first a new entry in relation $U$ with a unique uid has to be added: $\Delta U(u, \ldots)$. Then, back edges from each world to the root have to be added: $\Delta E(x, u, 0)$ :- $D(x, \_)$. Delete operations follow semantics similar to those of inserts.

5.4 Space complexity 

We next give theoretic bounds for the size of a BDMS in the number of tuples in the underlying RDBMS. Let $m$ be the number of users, $n$ be the number of annotations, $d$ the average depth of belief annotations, and $N$ the number of states in the canonical Kripke structure. Sizes of relations are $|U| = m$, $|D| = N$, $|S| = N - 1$, $|R^*| = O(n)$, and $|E| = O(mN)$. An insert into world $w$ can create up to $N_w$ entries in table $V$, where $N_w$ is 1 plus the number of worlds for which $w$ is a suffix state. For the root $\varepsilon$, $N_\varepsilon = N$, and hence, an insert at the root can create up to $N$ inserts into $V$. Hence, $|V| = O(nN)$, and the overall database size $|\mathcal{R}^*| = O((n + m)N)$.

In theory, $N$ is only loosely bounded by $O(nd)$ with the average depth of annotations $d$ as the number of



Algorithm 4: (insertTuple) Inserts signed tuple $R^s(k, \bar{x}')$ into existing world $w$ if insert is consistent. Returns the success of insert attempt.

Input: World belief path $w$ and id $y$, signed tuple $R^s(k, \bar{x}')$
Output: Success

1 Get existing or create new internal key $t$ for tuple $R(k, \bar{x}')$:
    $\Delta R^*(t, k, \bar{x}')$
2 Get all tuples of world $y$ with key $k$:
    $T_1(t', s', e')$ :- $V(y, t', k, s', e')$
3 If $t^s$ is already explicitly present in the world:
    if $(t, s, 'y') \in T_1$ then return false
4 If $t^s$ is already implicitly present in the world:
    if $(t, s, 'n') \in T_1$ then
        $\nabla V(y, t, k, s, 'n'),\ \Delta V(y, t, k, s, 'y')$, return true
5 If $t$ does not conflict with an existing explicit tuple ...
    if $s = '+' \wedge \nexists(t, '-', 'y') \in T_1 \wedge \nexists(\_, '+', 'y') \in T_1$ or
       $s = '-' \wedge \nexists(t, '+', 'y') \in T_1$ then
6       ... delete any conflicting implicit tuples:
        if $s = '+'$ then $\nabla V(y, t, k, '-', 'n'),\ \nabla V(y, \_, k, '+', 'n')$
        if $s = '-'$ then $\nabla V(y, t, k, '+', 'n')$
7       ... insert $t^s$ into $y$:
        $\Delta V(y, t, k, s, 'y')$
8       ... get all dependent worlds of $w$ and their depth:
        $T_2(z, r)$ :- $E^*(\_, w, z), D(z, r), r > d$
9       ... then, for each dependent world $z$ in order of depth:
        foreach $z \in T_2$ in ascending order of $r$ do
10          Get all tuples of world $z$ with key $k$:
            $T_3(t'', s'', e'')$ :- $V(z, t'', k, s'', e'')$
11          Insert $t^s$ into world $z$ if there is no conflict:
            if $s = '+' \wedge \nexists(t, '-', \_) \in T_3 \wedge \nexists(\_, '+', \_) \in T_3$ or
               $s = '-' \wedge \nexists(t, '+', \_) \in T_3$ then $\Delta V(z, t, k, s, 'n')$
12          Otherwise if conflicts are not explicit:
            else if $s = '+' \wedge \nexists(t, '-', 'y') \in T_3 \wedge \nexists(\_, '+', 'y') \in T_3$
               or $s = '-' \wedge \nexists(t, '+', 'y') \in T_3$ then
13              Get tuples with key $k$ from $dss(z)$:
                $T_4(t''', s''')$ :- $S(z, v), V(v, t''', k, s''', \_)$
14              Update $z$ if there are no conflicts with $dss(z)$:
                if $s = '+' \wedge \nexists(t, '-') \in T_4 \wedge \nexists(\_, '+') \in T_4$ then
                    $\nabla V(z, t, k, '-', 'n'),\ \nabla V(z, \_, k, '+', 'n'),\ \Delta V(z, t, k, '+', 'n')$
                if $s = '-' \wedge \nexists(t, '+') \in T_4$ then
                    $\nabla V(z, t, k, '+', 'n'),\ \Delta V(z, t, k, '-', 'n')$
        return true
    else
        return false



prefixes, hence, possible states. However, for bounded nesting depth of belief paths ($|w| \le d_{max}$), we have $N = O(m^{d_{max}})$ as the number of possible different belief paths $|\hat{U}^{d_{max}}|$ with depth up to $d_{max}$, which is constant in $n$. We then have $|\mathcal{R}^*| = O((n + m)\, m^{d_{max}})$, which becomes $O(n \cdot m^{d_{max}})$ for $n \gg m$.

We call the factor $|\mathcal{R}^*|/n$ the relative overhead in size for adding beliefs to databases. We have seen above that this factor is $O(m^{d_{max}})$ in the worst case, which is quite significant. For example, it is around 10,000 for a belief database with $m = 100$ users and belief annotations of depth up to $d_{max} = 2$. In practice, however, the overhead heavily depends on the number of belief worlds affected by inserts, which, in turn, depends on skews in the underlying annotations. We will illustrate these effects on the size of a BDMS by varying parameters in the annotation data in the next section.
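The worst-case factor can be checked numerically. The sketch below counts the belief paths of depth up to $d_{max}$ in which no user appears twice in a row, using the closed form from the proof of the canonical structure theorem ($1 + m + m(m-1) + \ldots + m(m-1)^{d-1}$) and a brute-force enumeration for comparison; the function names are hypothetical.

```python
from itertools import product


def num_belief_paths(m, d_max):
    """1 + m + m(m-1) + ... + m(m-1)^(d_max-1): paths of depth <= d_max."""
    return 1 + sum(m * (m - 1) ** (d - 1) for d in range(1, d_max + 1))


def enumerate_belief_paths(m, d_max):
    """Brute-force list of all paths without two equal consecutive users."""
    users = range(1, m + 1)
    paths = [()]
    for d in range(1, d_max + 1):
        for p in product(users, repeat=d):
            if all(p[i] != p[i + 1] for i in range(d - 1)):
                paths.append(p)
    return paths


bound_100_users = num_belief_paths(100, 2)  # 1 + 100 + 100 * 99
bound_10_users = num_belief_paths(10, 2)    # 1 + 10 + 10 * 9
```

For $m = 100$ and $d_{max} = 2$ this gives 10,001 possible states, in line with the "around 10,000" figure quoted above.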
Table 1: Relative overhead $|\mathcal{R}^*|/n$ of the size of a belief database for $n = 10,000$ annotations, 10 or 100 users, varying user participation (Zipf or uniform) and 3 distributions of annotation depth.
6. EVALUATION

We have implemented a prototype of a BDMS that allows bulk insertions of belief annotations and translations of belief conjunctive queries into SQL. We use this prototype to experimentally study (i) the relative overhead of managing annotations and (ii) query performance. The program to generate annotations and to translate queries is implemented in Java and uses JDBC for calls to an RDBMS. As experimental platform, we run Microsoft SQL Server 2005 on a Dual-Xeon machine (3 GHz) with 4 GB of main memory. We use the database schema from our running example of Fig. 5, neglecting the comments table for the experiments: $\mathcal{R}^* = (S^*, U, V_S, E, D, S)$, and measure the size as the number of all tuples in the database ($|\mathcal{R}^*|$). Clustered indexes are available over the internal keys. All experiments are performed on synthetic data.

6.1 Size of a BDMS 

We have seen in Sect. 5.4 that the relative overhead for adding beliefs to a database ($|\mathcal{R}^*|/n$), i.e. the number of tuples in the database per number of belief annotations, is $O(m^{d_{max}})$, which is 100 for $m = 10$ and 10,000 for $m = 100$ users, and belief annotations of maximum depth $d_{max} = 2$. In practice, the skew in the annotations, i.e. the distribution of the path length $k$, and the distinct count of belief paths can reduce this overhead dramatically. To study this dependency, we use a generic annotation generator that creates parameterized belief annotations. We model annotation skew as discrete probability distributions $\Pr[k = x]$ of the nesting depth of annotations (e.g. 1% of annotations are of nesting depth 2) and user participation as either uniform or following a generalized Zipf distribution (e.g. user 1 is responsible for 50% of all annotations, user 2 for 25%, ...). Table 1 shows the relative overhead of synthetic belief databases (each value averaged over 10 databases with the same parameters) and illustrates its variations with different distributions. Figure 6 further shows that the relative overhead can actually increase or decrease with the number of annotations $n$. The decrease for the lower, more skewed distribution arises from the decreasing relative overhead for supporting a constant number of users $m$ for increasing $n$: $O(\frac{n+m}{n}\, m^{d_{max}})$. Also note that, despite the upper blue graph suggesting an exponentially increasing relative overhead, it flattens again and will not surpass its theoretic bound of 10,000 in the limit. The take-away of this experiment is that the actual overhead of belief annotations can be significantly lower than their theoretic bound. But it is still substantial, and efficient techniques are needed to create more compact representations of belief databases. We briefly discuss future work on alternative representations at the end of this section.

Figure 6: $|\mathcal{R}^*|/n$ crucially depends on the annotation skew and can either increase or decrease with $n$ (100 users with uniform participation). [Plot: x-axis: number of annotations ($n$), 1E+1 to 1E+4; legend: distribution of belief path depths ($\Pr[k = x]$).]
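To make the experimental setup more concrete, the following sketch shows one way such a parameterized generator of belief paths could look. It is not the authors' generator: the Zipf weights $1/rank^s$, the rejection of repeated consecutive users, the function names, and the seeded random source are all assumptions, and the sketch requires at least two users.

```python
import random


def zipf_weights(m, s=1.0):
    """Unnormalized generalized Zipf weights for user ranks 1..m (assumed form)."""
    return [1.0 / (rank ** s) for rank in range(1, m + 1)]


def sample_belief_path(rng, users, weights, depth):
    """Draw users one by one, skipping a draw that repeats the previous user."""
    path = []
    while len(path) < depth:
        u = rng.choices(users, weights=weights, k=1)[0]
        if not path or path[-1] != u:
            path.append(u)
    return tuple(path)


def generate_paths(n, m, depth_dist, uniform=True, seed=0):
    """n belief paths over m >= 2 users; depth_dist maps depth -> Pr[k = depth]."""
    rng = random.Random(seed)
    users = list(range(1, m + 1))
    weights = [1.0] * m if uniform else zipf_weights(m)
    depths = list(depth_dist)
    probs = [depth_dist[d] for d in depths]
    paths = []
    for _ in range(n):
        depth = rng.choices(depths, weights=probs, k=1)[0]
        paths.append(sample_belief_path(rng, users, weights, depth))
    return paths


def count_states(paths):
    """Number of states of the canonical structure: all distinct prefixes."""
    return len({p[:k] for p in paths for k in range(len(p) + 1)})
```

Counting distinct prefixes with `count_states` gives the number of states $N$ that a generated workload touches, which is the quantity that drives the overhead discussed above.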

6.2 Query complexity 

In the following, we list 3 example queries. These 
queries cover the typical usage patterns in a BDMS and 
illustrate the enriched query semantics it can support. 

1. The first type is a query for content. It asks for 
the content of a particular belief world and is of 
the form "What does Alice believe?" In addition, 
we vary the depth of its belief path $d \in \{0, \ldots, 4\}$:

$q_{1,d}(x, y)$ :- $\Box_w S^+(x, \_, y, \_, \_)$, with $|w| \in \{0, \ldots, 4\}$

2. The second type is a query for conflicts. It asks 
for conflicts between belief worlds and corresponds 
to: "Which animal sightings does Bob believe that 
Alice believes, which he does not believe himself?" 

$q_2(x, y)$ :- $\Box_{2 \cdot 1} S^+(x, z, y, u, v),\ \Box_2 S^-(x, z, y, u, v)$

3. The third query is an example of a query for users, 
i.e. a query that explicitly includes a user id as 
variable in the answer. It corresponds to: "Who 
disagrees with any of Alice's beliefs of sightings at 
Lake Placid?" Note that the query variable only 
appears in the belief path of a negative subgoal. 

$q_3(x)$ :- $\Box_x S^-(y, z, u, v, 'a'),\ \Box_1 S^+(y, z, u, v, 'a')$

The evaluation time scales roughly linearly with the size of the BDMS ($|\mathcal{R}^*|$). In Table 2 we report the size of the result sets, average query times and standard deviation for a belief database with 10,000 annotations and 224,339 tuples (relative overhead 22.4). Each query was executed 1,000 times. Before each query execution, we clear all database caches with SQL Server-specific commands. Remaining variations in execution times of identical queries result from fluctuations in the OS beyond our control. The runtimes are in the hundreds of milliseconds. Content query $q_1$ is clearly fastest as it ranges only over one world. The execution time increases by adding 1 join with relation $E$ ($q_{1,0}$ to $q_{1,1}$) but then remains stable for 2 to 4 joins ($q_{1,2}$ to $q_{1,4}$) as $E$ is small compared to $|\mathcal{R}^*|$. Conflict query $q_2$ is slower as it has two subgoals, one of which is negative, which
Table 2: Execution times and size of result sets for our seven example queries executed over a belief database with 10,000 annotations.



Result size: 1626, 2816, 2253, 2061, 1931, 196, 99


requires evaluation of nested disjunctions. The query for users $q_3$ is slowest as it includes a negative subgoal and ranges over the belief worlds of all users.

Overall, our experiments suggest that queries in a BDMS can be executed in a reasonable amount of time (i.e. milliseconds) on top of a standard RDBMS.

6.3 Future Work 

The dominant research challenge is to find techniques to decrease the relative overhead of belief databases. Recall that this overhead arises as a result of the default assumption. For example, a generally accepted fact is, by default, believed by every user, and gets inserted into their respective belief worlds. More precisely, our current canonical Kripke structure stores $\bar{D}$, the set of all entailed beliefs, which means that it eagerly applies all instances of the default rule to $D$; this causes the database to increase. An alternative approach is to apply the default rule only selectively, or not at all, during updates and to apply it only during query evaluation. This will complicate the query translation, but, at the same time, will drastically reduce the size of the database.

At the same time, a careful analysis and categorization of types of queries that are common in community databases will allow optimizing query time for certain queries. For example, conflict queries commonly focus on tuple-wide conflicts. Modeling functional dependencies between attributes during query translation will make it possible to back-chase tuple-wide attribute joins and reduce the number of necessary join attributes.

We are currently exploring these tradeoffs. 



7. RELATED WORK 

Work on annotations management in databases is often intertwined with provenance management [11], studying the propagation of annotations during query evaluation [12, 13]. In those contexts, annotations on data are commonly understood today as superimposed information that helps to explain, correct, or refute the base information [17, 36]. They are sometimes interpreted as colors, alternatively applied to individual values [7], sets of values [23, 24], or as bundled tuples in tree fragments [11]. The important role that annotations play in science has been pointed out several times [1, 8, 10, 19, 20]. In all those settings, the semantic distinction between base information and annotations has remained blurred [10]. Annotations are simply additional data added to existing data [44]. In contrast, we propose to give a concrete semantics to annotations that helps users engage in a structured discussion on content and each other's annotations.



Several hardness results on inference in modal logic are well known [26, 31, 35]. Applications to reasoning about knowledge of agents in distributed settings are summarized in [21]. Modal logics have also been considered in databases before: Calvanese et al. [14] use the modal logic $K45_n^A$ to manage conflicts in a peer-to-peer system. Modalities are used to allow mappings between peers to exist even in the presence of conflicts. The work shares with us the common goal of using modalities to manage conflicting tuples, e.g. key violations. However, it differs as follows: (i) our modalities are part of the data. Users can add modalities to the data and ask queries with modalities to extract the desired facts from the database; (ii) in [14] the number of modalities is proportional to the size of the schema. In our case their number is proportional to the database; (iii) [14] considers only modalities of nesting depth 1. We allow arbitrary depth; and (iv) in [14] inference is in coNP. In contrast, ours is in PTIME (the difference comes from the fact that we restrict the use of negation to atomic tuples only). Another related work by Nguyen [38] constructs finite least Kripke models for different language fragments. The described algorithm runs in exponential time and returns a model with size $2^{O(n)}$, where $n$ is the size of the input. In our work, we consider the fragment of certain and impossible beliefs and construct polynomial canonical representations. We also provide insert, delete, and update functionality for our model and can translate it into standard relations.

Another work that considers key violations is [22].
Here the approach differs from ours: key violations are 
allowed in the database, and are only resolved at query 
time through repairs [5]. Repairs are explored automati- 
cally by the system. At a high level, only those answers 
are returned that can be computed without any con- 
flicts; there are no modalities, and hence the users have 
no control over conflict resolution. 

There is a large body of work on managing uncertain and incomplete information [2, 5, 16, 18, 28, 47].
For example, Widom [48] describes the Christmas Bird 
Count as motivation, which is similar to our motivating 
scenario. Our work shares a similar motivation for in- 
formation that is not certain. However, we do not mea- 
sure, track, nor evaluate uncertainty; we rather allow 
conflicting user views and provide means for structured 
discourse inside a database. 

We also share motivation with work on peer-data management and collaborative data sharing systems that have to deal with conflicting data and lack of consensus about which data is correct during integration [6, 29, 30, 34]. In contrast to these systems, we do not address the problem of schema integration. We consider conflicts at the data level within a given common schema. Systems like ORCHESTRA [27, 33, 45] enable different
peers to accept or reject changes to shared data made 
by other peers. Each peer can have its own view of the 
data. This view, however, is materialized once for each 
peer in its separate database instance. In contrast, we 
propose to allow conflicting information to co-exist in 
a single database and we allow users to discuss these 
conflicts. 
8. CONCLUSIONS

This paper describes a model of database annotations that allows users to annotate both content and other users' annotations with beliefs. It allows users to collaboratively contribute to and curate a shared data repository, as is common today in large-scale scientific database applications. It also allows conflicts and inconsistencies between different users and their views to be managed explicitly. We introduce the concept of belief databases, give a concrete application throughout the paper, show a polynomial-size encoding of our desired semantics on top of relational databases, and validate this concept with a prototype and tests on synthetic data.
APPENDIX

A. NOMENCLATURE 



$\mathcal{R}$          external schema with $\mathcal{R} = (R_1, \ldots, R_r)$
$Tup$        tuple universe with example tuple $t \in Tup$
$I$          standard database instance over the external schema
$W$          belief world instance with $W = (I^+, I^-)$
$[\![W]\!]$      semantics of $W$ as incomplete database
$\Gamma$          key constraints of a standard database instance $I$
$\Gamma_1, \Gamma_2$     consistency constraints for a belief world $W$
$U$          set of users $\{1, \ldots, m\}$ with $m = |U|$
$\hat{U}^*$         $\hat{U}^* = \{w \in U^* \mid w_{[i]} \neq w_{[i+1]}\}$
$u, v, w$    belief paths $\in \hat{U}^*$
$\bar{w}$          belief path of zero or more variables and constants
$\bar{x}$          tuples of variables and constants
$d$          nesting depth or belief path length with $d = |w|$
$\varphi$          belief statement $\varphi = \Box_w t^s$ with $s$ as sign
$D$          data with belief statements $\{\varphi_1, \ldots, \varphi_n\}$, $n = |D|$
$D_w$         explicit belief world at $w$ (with belief path $w$)
$\bar{D}$          entailment of $D$
$\bar{D}_w$         entailed belief world at $w$
$g$          number of subgoals of a belief conjunctive query
$K$          pointed Kripke structure with $K = (V, (W_v)_{v \in V}, (E_i)_{i \in U}, v_0)$
$V$          set of states
$W_v$         belief world associated with a state $v \in V$
$v_0$         root of the Kripke structure; $v_0 \in V$
$E_i$         set of edges associated with each user $i \in U$ with $E_i \subseteq V \times V$
$\mathcal{R}^*$         internal schema; relational representation of a belief database



B. PROOFS

B.1 Proof Proposition 5

Proposition 5 (consistency of a belief world)
A belief world $W$ is consistent iff it satisfies the following two constraints:

$\Gamma_1(W) \equiv \Gamma(I^+)$
$\Gamma_2(W) \equiv \forall t \in I^+.\ t \notin I^-$

Proof. (1) We first show $\Gamma_1(W) \wedge \Gamma_2(W) \Rightarrow [\![W]\!] \neq \emptyset$: WLOG, we focus on a fixed key $k$. The procedure is implicitly assumed to be repeated for all $k \in \{key(t) \mid t \in Tup\}$. From $\Gamma_1$ follows that there is either (i) no or (ii) one positive tuple with key $k$ in the belief world $W$. From $\Gamma_2$ follows that there can be zero or more negative tuples with key $k$ in the belief world, but none that is already contained as positive tuple. In case (i), we can create a consistent complete database for key $k$ by making all tuples with this key $t.(key = k)$ negative. In case (ii), we make all tuples with the same key negative, except for the positive one.

(2) $[\![W]\!] \neq \emptyset \Rightarrow \Gamma_1(W) \wedge \Gamma_2(W)$: We again focus on a fixed key $k$. A complete database $I \in [\![W]\!]$ is consistent, if for each key $k$, there are either zero or one (positive) tuples in the database. Therefore $\Gamma_1$. A tuple cannot at the same time be in the database and not be in the database. Therefore $\Gamma_2$. $\Box$

B.2 Proof Proposition 7

Proposition 7 (certain and impossible tuples)
Let $t$ be a tuple in $Tup_i$, i.e. it is a typed tuple for relation $R_i$. Tuple $t$ is a positive belief for $W$ iff it is in $I^+$. It is a negative belief for $W$ iff it is either in $I^-$ ("stated negative") or if there is another tuple $t' \in I^+$ from $Tup_i$ with the same key ("unstated negative"):

$W \models t^+$ iff $t \in I^+$

$W \models t^-$ iff $t \in I^- \vee \exists t' \in Tup_i.(t' \in I^+ \wedge t' \neq t \wedge key(t') = key(t))$



PROOF. (1) What we have to show is t £ I + <^> / 

c/A/n_r = 0A r(/) a t <£ i). (la) if 

t £ 7+ and I + C I, then tel. Hence, /3I.(t I). 

(lb) <j=: if v/.(/+ c/A/nr=8A r(i) t e I), 

then t £ I + . As if t £" I + , then there is always some 
consistent I .{it' .(key {t') = key(t) => t' £ I)) for which 
t<£I. 

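(2) $W \models t^- \Leftrightarrow t \in I^- \vee \exists t' \in I^+.(key(t') = key(t) \wedge t' \neq t)$: From Def. 3 and Def. 6 we know that $W \models t^-$ iff $\forall I.(I^+ \subseteq I \wedge I \cap I^- = \emptyset \wedge \Gamma(I) \Rightarrow t \notin I)$. Hence, we have to show

$t \in I^- \vee \exists t' \in I^+.(key(t') = key(t) \wedge t' \neq t)$
$\Leftrightarrow \nexists I.(I^+ \subseteq I \wedge I \cap I^- = \emptyset \wedge \Gamma(I) \wedge t \in I)$

(2a) $\Rightarrow$: If $t \in I^-$ then $\nexists I.(I \cap I^- = \emptyset \wedge \Gamma(I) \wedge t \in I)$. If $\exists t' \in I^+.(key(t') = key(t) \wedge t' \neq t)$ then $\nexists I.(I^+ \subseteq I \wedge \Gamma(I) \wedge t \in I)$.

(2b) $\Leftarrow$: $\forall I.(I^+ \subseteq I \wedge I \cap I^- = \emptyset \wedge \Gamma(I) \Rightarrow t \notin I)$ is equivalent to $\forall I.(I \cap I^- = \emptyset \Rightarrow t \notin I) \vee \forall I.(I^+ \subseteq I \wedge \Gamma(I) \Rightarrow t \notin I)$. The first proposition $\forall I.(I \cap I^- = \emptyset \Rightarrow t \notin I)$ is true iff $t \in I^-$. The second proposition $\forall I.(I^+ \subseteq I \wedge \Gamma(I) \Rightarrow t \notin I)$ is true iff $\exists t' \in I^+.(key(t') = key(t) \wedge t' \neq t)$. □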
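The characterization can be compared against the possible-instance semantics on a small example. The sketch below assumes a finite tuple universe `tup`, a consistent belief world, and a hypothetical `key` function; it enumerates every instance $I$ with $I^+ \subseteq I$, $I \cap I^- = \emptyset$ and $\Gamma(I)$ by brute force and checks that the syntactic tests agree. All names and data are illustrative only.

```python
from itertools import combinations


def key_ok(inst, key):
    """Gamma(I): no two tuples of the instance share a key."""
    keys = [key(t) for t in inst]
    return len(keys) == len(set(keys))


def possible_instances(tup, i_pos, i_neg, key):
    """All I over the universe `tup` with I+ subset of I, I disjoint from I-, Gamma(I)."""
    free = sorted(tup - i_pos - i_neg)
    for r in range(len(free) + 1):
        for extra in combinations(free, r):
            inst = i_pos | set(extra)
            if key_ok(inst, key):
                yield inst


def positive_belief(t, i_pos):
    return t in i_pos


def negative_belief(t, i_pos, i_neg, key):
    # "stated negative" or "unstated negative" (another positive tuple, same key)
    return t in i_neg or any(u != t and key(u) == key(t) for u in i_pos)


key = lambda t: t[0]
tup = {("s1", "crow"), ("s1", "raven"), ("s2", "eagle"), ("s2", "hawk")}
i_pos, i_neg = {("s1", "crow")}, {("s2", "hawk")}
insts = list(possible_instances(tup, i_pos, i_neg, key))
for t in tup:
    assert positive_belief(t, i_pos) == all(t in i for i in insts)
    assert negative_belief(t, i_pos, i_neg, key) == all(t not in i for i in insts)
```

In this example ("s2", "eagle") is neither a positive nor a negative belief: it occurs in some possible instances but not in all of them.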

B.3 Proof Theorem 16

Theorem 16 (Canonical Kripke Structure)

(1) For any belief statement $\varphi$, $D \models \varphi$ iff $K(D) \models \varphi$.

(2) $K(D)$ can be computed in time $O(m^d n)$, where $n$ is the size of the belief database $D$, $m$ the number of users, and $d$ the maximum length of any belief path in $D$.



Proof. (1) From Def. 12, $D \models \varphi \Leftrightarrow \varphi \in \bar{D}$. Hence, it suffices to show that $K(D) \models \varphi \Leftrightarrow \varphi \in \bar{D}$. We proceed in 5 steps:

(1a) We first construct an empty infinite Kripke tree frame $T_K$ for $U$, i.e. a tree with root $v_0$ so that for each belief path $w \in U^*$, there is exactly one node in $T_K$ whose path from the root is $w$. Therefore, for each node at depth $d \neq 0$ and incoming $i$-edge ($i \in U$), we create $m - 1$ child nodes at depth $d + 1$ for all $j \in U \setminus \{i\}$. This tree has 1 node at depth 0, $m$ nodes at depth 1, $m(m-1)$ nodes at depth 2, and in general $m(m-1)^{d-1}$ nodes at depth $d$. The number of nodes with depth $\le d$ is then $N(d) = 1 + m + m(m-1) + \ldots + m(m-1)^{d-1}$. Using the geometric series, we get $N(d) = 1 + \frac{m}{m-2}((m-1)^d - 1) = O(m^d)$.
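As a quick sanity check of the node count, the sketch below enumerates all belief paths up to a given depth (paths in which no user appears twice in a row) and compares their number with the closed form. The function names are illustrative; the closed form is only evaluated for $m \neq 2$, where the denominator $m - 2$ is nonzero.

```python
def belief_paths(users, depth):
    """Enumerate all belief paths of length <= depth in which no user
    appears twice in a row (the nodes of the tree frame up to that depth)."""
    paths = [()]
    frontier = [()]
    for _ in range(depth):
        frontier = [p + (u,) for p in frontier for u in users
                    if not p or p[-1] != u]
        paths.extend(frontier)
    return paths


def closed_form(m, d):
    """N(d) = 1 + m/(m-2) * ((m-1)^d - 1); only defined for m != 2."""
    # (m-1)^d - 1 is divisible by m-2, so integer division is exact.
    return 1 + m * ((m - 1) ** d - 1) // (m - 2)


for m, d in [(3, 2), (4, 3), (5, 4)]:
    users = [f"u{i}" for i in range(m)]
    assert len(belief_paths(users, d)) == closed_form(m, d)
```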

(1b) We next consider Kripke trees, i.e. Kripke tree frames with each node corresponding to a belief world with belief path corresponding to the path from the root to each node. We define a sequence $T_K(D^{(i)})$ of Kripke trees corresponding to the infinite sequence from Def. 9:

$D^{(0)} = D$

$D^{(d+1)} = D^{(d)} \cup \{\Box_i \varphi \mid \varphi \in D^{(d)}, \; i \in U, \;$ belief path of $\Box_i \varphi \in U^*, \; D^{(d)} \cup \{\Box_i \varphi\}$ is consistent$\}$

where for each $\varphi \in D^{(i)}$ with $\varphi = \Box_w t^s$, we add $t^s$ to the node with path $w$ in $T_K(D^{(i)})$. We call $T_K(D)$ the canonical Kripke tree. As there is exactly one node for each belief path $w \in U^*$ with the path $w$ from the root, it follows that $(T_K(D^{(i)}), v_0) \models \varphi \Leftrightarrow T_K(D^{(i)}) \models \varphi$ if and only if $\varphi \in D^{(i)}$. In particular, $T_K(D) \models \varphi \Leftrightarrow \varphi \in \bar{D}$. Figure 7 shows the first two Kripke trees up to depth 2 for our running example $D$ from Fig. 2.




Figure 7: Sequence of Kripke trees $T_K(D^{(i)})$ cut off at depth 2 for our running example.

(1c) We next show that subtrees of $T_K(D)$ that do not contain States(D) can be pruned and replaced by an appropriate back edge, so that for the resulting model $T'_K(D)$, it holds that $T'_K(D) \models \varphi \Leftrightarrow T_K(D) \models \varphi$: Consider the general subtree starting from node #2 with path $w = i \cdot v \cdot j$ ($v, w \in U^*$; $i, j \in \{\varepsilon\} \cup U$) in Fig. 8a. Assume the subtree contains no States(D), i.e. there is no $\varphi \in D$ with belief path $w \cdot u$ ($w \cdot u \in U^*$). Then the subtree starting from node #2 is isomorphic to the subtree starting at node #3 with path $v \cdot j$, which we show as follows:

(i) each node at depth $d = |w \cdot u|$ in subtree #2 can be mapped to a node at depth $d - 1$ in the subtree at #3 in such a way that the edges of #2 map to edges in #3. This follows inductively by starting to map node #2 to #3 and repeating at each subsequent depth;

(ii) each belief world at a node in #2 is the same as the corresponding belief world at #3. This follows from the fact that for each node with path $w \cdot u$ in #2, $D_{w \cdot u} = \{\}$. Hence each tuple in subtree #2 of $T_K(D)$ is inserted by the default rule from Def. 9, which inserts each tuple $t^s$ in $\varphi = \Box_{v \cdot j \cdot u} t^s$ from the node with path $v \cdot j \cdot u$ in #3 into the corresponding node with path $i \cdot v \cdot j \cdot u$ in #2. Hence, corresponding nodes have the same belief worlds.
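As a consequence, we can create a new Kripke tree $T'_K(D)$ from $T_K(D)$ with pruned subtree #2 and the forward $j$-edge (#1, #2) replaced by a back $j$-edge (#1, #3) as shown in Fig. 8b. We know $(T_K(D), \#2) \models \varphi \Leftrightarrow (T_K(D), \#3) \models \varphi$. It follows: $T_K(D) \models \Box_{i \cdot v \cdot j} \varphi \Leftrightarrow (T_K(D), v_0) \models \Box_{i \cdot v \cdot j} \varphi \Leftrightarrow (T_K(D), \#2) \models \varphi \Leftrightarrow (T_K(D), \#3) \models \varphi \Leftrightarrow (T'_K(D), v_0) \models \Box_{i \cdot v \cdot j} \varphi \Leftrightarrow T'_K(D) \models \Box_{i \cdot v \cdot j} \varphi$. Hence, the original and the pruned Kripke tree have the same theory: $T_K(D) \models \varphi \Leftrightarrow T'_K(D) \models \varphi$. Note that node #3 with path $v \cdot j$ is the node with the largest suffix of node #2 with path $i \cdot v \cdot j$.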



Figure 8: The pruned Kripke trees $T'_K(D)$ and $T''_K(D)$ have the same theory as $T_K(D)$.

(1d) If the node #3 $\notin$ States(D), then #3 can itself be pruned according to (1c), i.e. we can replace the $j$-edge (#4, #3) by a $j$-edge from #4 to the largest suffix of #3, say #5. Since the subtrees #3 and #5 are isomorphic, so are #2 and #5. Hence, we can further replace the $j$-edge (#1, #3) by the $j$-edge (#1, #5), and we still have $T_K(D) \models \varphi \Leftrightarrow T''_K(D) \models \varphi$. As a consequence, each forward $j$-edge (#1, #2) with #2 $\notin$ States(D) can be replaced with a back $j$-edge (#1, #5), where #5 has the largest suffix path of #1 and $\in$ States(D).

If we repeat this pruning for each edge between a node #1 $\in$ States(D) and a node #2 $\notin$ States(D), then we get exactly the construction of the canonical Kripke model $K(D)$ in Def. 16. Hence, $T_K(D) \models \varphi \Leftrightarrow K(D) \models \varphi$, and $\varphi \in \bar{D} \Leftrightarrow K(D) \models \varphi$.
(2) We first give an alternative construction of K(D) 
that avoids the intermediate infinite Kripke tree and 
then evaluate the complexity of this method. 
(2a) Fix a node in the canonical Kripke tree $T_K(D)$ with path $w = u \cdot v \cdot x \cdot y$ ($u, v, x, y, w \in U^*$) and $|w| = d$. For a tuple $t^s$ to be in the node with path $w$, either (i) $t^s \in D_w$ or (ii) $\exists w' = u \cdot v$ with $t^s \in D_{w'}$ and $\forall w'' = u \cdot v \cdot x$: $D_{w''} \cup t^s$ is consistent. It follows that the content of a node with path $w$ and $k = |w|$ in the canonical Kripke tree $T_K(D)$ (and hence, the world $\bar{D}_w$) is the same as in the Kripke tree $T_K(D^{(k)})$ (and hence, the world $D^{(k)}_w$), and it can be deduced by only analyzing all suffix worlds of $w$: $\{D_w, D_{w[2,d]}, \ldots, D_{w[d]}, D_\varepsilon\}$. Figure 9 illustrates that the content of world $D^{(d)}_w$ in the lower left corner, and hence $\bar{D}_w$, can be deduced in the following iterative way: start with the root world $D_\varepsilon$. Insert all tuples from $D_\varepsilon$ into the belief world $D_{w[d]}$ that are consistent with $D_{w[d]}$. Repeat this step for all belief worlds until $D_w$.




Figure 9: Calculating $\bar{D}_w$ can be done in $d = |w|$ steps by consecutively analyzing all suffix states of $w$ with increasing depth and inserting all tuples from $D_{w[x,d]}$ into $D_{w[x-1,d]}$ which are consistent with the tuples already in $D_{w[x-1,d]}$.
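The iterative procedure can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: a world is a pair of Python sets (positive tuples, negative tuples), the explicit annotations are stored in a dictionary from belief paths (tuples of user names, with the empty tuple for the root) to worlds, and `key` is a hypothetical key-extraction function.

```python
def consistent_with(world, t, sign, key):
    """Can the signed tuple t^sign be added to `world` without violating
    the two constraints of Prop. 5?"""
    pos, neg = world
    if sign == "+":
        return t not in neg and all(u == t or key(u) != key(t) for u in pos)
    return t not in pos


def overriding_union(own, inherited, key):
    """Keep everything in `own`; add inherited signed tuples that stay consistent."""
    pos, neg = set(own[0]), set(own[1])
    for t in inherited[0]:
        if consistent_with((pos, neg), t, "+", key):
            pos.add(t)
    for t in inherited[1]:
        if consistent_with((pos, neg), t, "-", key):
            neg.add(t)
    return pos, neg


def world_content(explicit, path, key):
    """Content of the world with belief path `path`, computed from the root
    through the suffixes w[d], w[d-1,d], ..., w in d = |w| steps."""
    current = explicit.get((), (set(), set()))
    for start in range(len(path) - 1, -1, -1):
        own = explicit.get(path[start:], (set(), set()))
        current = overriding_union(own, current, key)
    return current


key = lambda t: t[0]
explicit = {
    (): ({("s1", "crow")}, set()),
    ("Bob",): ({("s1", "raven")}, {("s1", "crow")}),
}
print(world_content(explicit, ("Bob",), key))
print(world_content(explicit, ("Bob", "Alice"), key))
```

In this made-up example Bob overrides the root tuple in his own world, while the world for the path Bob · Alice inherits the root tuple through its suffix Alice, because nothing has been stated for either of those two paths.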

(2b) The algorithm for constructing the whole canonical Kripke model $K(D)$ proceeds in four steps. (i) It starts with the canonical Kripke frame, i.e. all nodes that correspond to States(D), together with the forward edges. (ii) It then inserts all back edges to the largest suffixes according to Def. 16. (iii) It then determines for each node in the model its largest suffix node in order to construct an inverted suffix tree, i.e. the tree from the root node to all other nodes where a node #1 with path $i \cdot v$ is a child of another node #4 with path $v$ if the path of #4 is a suffix of the path of #1 and no deeper node is a suffix of #1. (iv) It finally uses either a breadth-first algorithm that calculates for each state the overriding union with its largest suffix state, or a depth-first algorithm that traverses the inverted suffix tree and calculates the overriding union at each step.
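Steps (i) and (ii) can be sketched with the naive suffix search that the complexity analysis refers to. The code assumes that the states are given as a set of belief paths (tuples of user names) that contains the root `()`; the function names and the example states are illustrative and not part of the paper.

```python
def deepest_suffix_state(path, states):
    """Return the longest suffix of `path` that is a state (naive search)."""
    for start in range(len(path) + 1):
        if path[start:] in states:
            return path[start:]
    raise ValueError("the root path () must be a state")


def build_edges(states, users):
    """Map (source path, user j) to the target state of the j-edge.
    The target is the child itself if it is a state (forward edge),
    otherwise the deepest suffix state of the child path (back edge)."""
    edges = {}
    for w in states:
        for j in users:
            if w and w[-1] == j:
                continue  # belief paths never repeat a user consecutively
            edges[(w, j)] = deepest_suffix_state(w + (j,), states)
    return edges


states = {(), ("A",), ("B",), ("A", "B")}
edges = build_edges(states, ["A", "B", "C"])
print(edges[(("A",), "B")])       # forward edge to ('A', 'B')
print(edges[(("A", "B"), "A")])   # back edge to ('A',)
print(edges[(("A", "B"), "C")])   # back edge to the root ()
```

A production implementation would likely replace the linear scan over suffixes with an index, but the naive version mirrors the worst-case argument used in (2c).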
(2c) Complexity:

(i) Construction of the canonical Kripke frame (i.e. the canonical Kripke structure without any tuples) without the back edges takes $O(nd)$, as for each of the $n$ belief statements $\varphi \in D$ with belief path $w$ we need $|w| \le d$ operations.

(ii) Construction of the back edges takes $O(m^{d+1} d^2)$ time: a canonical Kripke frame with all worlds $w \in U^*$ and $|w| \le d$ has $N(d) = 1 + \frac{m}{m-2}((m-1)^d - 1) = O(m^d)$ nodes. The root node has $m$, each other node $m - 1$ outgoing edges. The number of edges is hence $E(d) = m + \frac{m(m-1)}{m-2}((m-1)^d - 1) = O(m^{d+1})$. The number of leaf nodes is $m(m-1)^{d-1} = O(m^d)$; the frame hence needs $m(m-1)^d = O(m^{d+1})$ back edges. For each such edge, we have to find the largest suffix node, which takes $\frac{d(d-1)}{2} = O(d^2)$ in the worst case and with a naive algorithm.

(iii) For each of the $O(m^d)$ nodes in the Kripke frame, determine the node in the model with the largest suffix of its path. This can be bounded by $O(m^d(d-1))$, analogous to (ii) above.

(iv) Inserting all implicit beliefs is $O(m^d n)$. For that, assume the worst case of a canonical Kripke frame with all worlds $w \in U^*$ and $|w| \le d$. Inserting a tuple at the root leads to checking all $O(m^d)$ nodes in the model. The insert/check at each node can be performed in $O(1)$ with a hash index on the key attribute. Hence, $O(m^d n)$.

(v) Note that $n \ge m(m-1)^{d-1}$ for the bound in (ii), as this is the minimum number of annotations for the canonical Kripke model to include all states with $w \in U^*$ and $|w| \le d$. Further note that $(m-1)^{d-1} \ge d^2$ for ($m \ge 3$, $d \ge 7$) or ($m \ge 4$, $d \ge 3$) or $m \ge 5$. Hence, we can bound $O(m^{d+1} d^2)$ by the looser bound $O(m^d n)$, which is the overall bound. □
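The arithmetic behind the bounds can be checked numerically. The sketch below counts edges level by level (the root has $m$ outgoing edges, every other node has $m - 1$), compares the result with the closed form, and tests the inequality in its non-strict form $(m-1)^{d-1} \ge d^2$ on the stated ranges. Function names are illustrative, and the closed form is evaluated only for $m \neq 2$.

```python
def edge_count(m, d):
    """Edges of the frame with all paths up to depth d, counted per node."""
    non_root_nodes = sum(m * (m - 1) ** (k - 1) for k in range(1, d + 1))
    return m + (m - 1) * non_root_nodes


def closed_edge_count(m, d):
    """E(d) = m + m(m-1)/(m-2) * ((m-1)^d - 1); only defined for m != 2."""
    return m + m * (m - 1) * ((m - 1) ** d - 1) // (m - 2)


def inequality_holds(m, d):
    """(m-1)^(d-1) >= d^2, used to absorb the d^2 factor into n."""
    return (m - 1) ** (d - 1) >= d * d


assert all(edge_count(m, d) == closed_edge_count(m, d)
           for m in range(3, 8) for d in range(1, 8))
assert all(inequality_holds(3, d) for d in range(7, 40))
assert all(inequality_holds(4, d) for d in range(3, 40))
assert all(inequality_holds(m, d) for m in range(5, 12) for d in range(1, 40))
# The thresholds are tight: the inequality fails just below them.
assert not inequality_holds(3, 6) and not inequality_holds(4, 2)
```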



C. DEFAULT LOGIC



We briefly review Reiter's default logic [41] before drawing the connection to our message board assumption. We mostly follow the exposition and notation of Gottlob [26] and Brachman and Levesque [9].

C.1 Default logic primer

A default rule $d$ is a configuration of the form

$\frac{\alpha : \beta}{\omega}$

where $\alpha$, $\beta$, and $\omega$ are propositional sentences. Usually, $\alpha$ is called the prerequisite, $\beta$ the justification, and $\omega$ the consequence of $d$. A default rule $d$ is satisfied by a deductively closed set of sentences $S$ if, whenever $\alpha$ is an element of $S$ and $\beta$ is consistent with $S$, then $\omega$ also is an element of $S$. A normal default rule is one where justification and consequence are the same:

$\frac{\alpha : \omega}{\omega}$



A default that contains formulae with free variables is 
sometimes called a default schema. It defines the set 
of all default rules obtained for all ground substitu- 
tions that assign values to all free variables occurring 
in the schema. A propositional default theory is a pair 
$T = (W, D)$, where $W$ is a finite set of propositional
sentences, sometimes called the background theory, and 
D is a finite set of default rules. 

Informally, an extension $E$ of a default theory $(W, D)$ is a grounded minimal deductively closed set of propositional formulae containing $W$ and satisfying all defaults of $D$. Hence, extensions are (minimal) fixed points of the operator $D$, namely, further application of the default rules in $D$ to the sentences in an extension has no effect. More formally: Let $(W, D)$ be a default theory. For any set $S$ of propositional formulae, let $\Gamma(S)$ be the smallest set $U$ satisfying the following three properties:

(1) $W \subseteq U$.

(2) $U$ is deductively closed.

(3) If $\frac{\alpha : \beta}{\omega} \in D$ and $\alpha \in U$ and $\neg\beta \notin S$, then $\omega \in U$.

An extension of $(W, D)$ is a fixpoint of $\Gamma$, i.e. a set $E$ of propositional formulae satisfying $\Gamma(E) = E$.

An alternative definition is as follows: Let $T = (W, D)$ be a default theory. A set of sentences $E$ is an extension of $T$ if and only if for every sentence $\varphi$,

$\varphi \in E$ iff $W \cup \{\omega \mid \frac{\alpha : \beta}{\omega} \in D, \; \alpha \in E, \; \neg\beta \notin E\} \models \varphi$

An equivalent algorithmic definition is as follows. A default $\frac{\alpha : \beta}{\omega}$ is applicable to a propositional theory $W$ if $W \models \alpha$ and $W \cup \beta$ is consistent. The application of this default to $W$ leads to the theory $W \cup \omega$.

A default theory can have one, several, or no extensions. A normal default theory, i.e. a default theory that has only normal default rules, has at least one extension.
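The algorithmic reading can be illustrated on a deliberately restricted case. The sketch below assumes that the background theory and all prerequisites and consequences are single literals written as strings, with `~` marking negation; under that assumption, entailment of a literal reduces to membership and consistency reduces to the absence of the complementary literal. This is a simplification for illustration, not a general default-logic reasoner, and all names are hypothetical.

```python
def negate(lit):
    """Complement of a literal written as 'p' or '~p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def apply_normal_defaults(background, defaults):
    """Apply normal defaults (prerequisite, consequence) in the given order
    until no default is applicable any more; returns one extension
    (restricted to literals) of the theory."""
    theory = set(background)
    changed = True
    while changed:
        changed = False
        for pre, cons in defaults:
            applicable = pre in theory and negate(cons) not in theory
            if applicable and cons not in theory:
                theory.add(cons)
                changed = True
    return theory


# One extension: birds fly by default.
print(apply_normal_defaults({"bird"}, [("bird", "flies")]))
# The default is blocked when its justification is inconsistent.
print(apply_normal_defaults({"bird", "~flies"}, [("bird", "flies")]))
# Two conflicting normal defaults: the result depends on the order,
# which corresponds to a theory with more than one extension.
w = {"quaker", "republican"}
d = [("quaker", "pacifist"), ("republican", "~pacifist")]
print(apply_normal_defaults(w, d), apply_normal_defaults(w, list(reversed(d))))
```

The last example shows why order independence is not automatic for normal default theories in general and has to be argued separately for a specific default schema.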

C.2 Default beliefs for belief databases 

We can define the belief theory $T$ of a belief database $D$ of a finite set of belief statements as the pair $(D, \Delta)$, where $\Delta$ is the set of default assumptions consisting of one default schema, the message board assumption

$\frac{\varphi : \Box_i \varphi}{\Box_i \varphi}$

In our notation, the extension $\bar{D}$ of a belief database consists of the explicitly annotated belief statements $D$ and a set of implied belief statements from the default assumption, such that no additional beliefs can be implied from $\bar{D}$ that are consistent with $\bar{D}$. We call all $\varphi \in D$ the explicit belief statements, and all $\varphi \in \bar{D} \setminus D$ the implicit belief statements of a belief database.

Replacing Def. 9 and Def. 10 with the following definition gives an alternative definition of the semantics of a belief database together with Def. 12.

Definition 19 (Extension of a belief database).
Given a belief database $D$ and a set of default assumptions $\Delta = \{d_s\}$ consisting of one normal default schema

$d_s = \frac{\varphi : \Box_i \varphi}{\Box_i \varphi}$

where $\varphi$ and $\Box_i \varphi$ are belief statements over the external schema. A set of sentences $\bar{D}$ is an extension of a belief database if and only if for every sentence $\varphi$,

$\varphi \in \bar{D}$ iff $\varphi \in D \vee \varphi \in \{\omega \mid \frac{\alpha : \beta}{\omega} \in \Delta, \; \alpha \in \bar{D}, \; \beta$ is consistent with $\bar{D}\}$

Figure 10: A belief database "contains" or entails more than just the explicit belief annotations.



The important consequence of the following lemma is that the order in which default rules from the default schema are applied does not matter and we have one unique stable model for $D$ (observation in Sect. 3.4). This observation allows an efficient depth-first construction of the materialized canonical belief database.



Lemma 20. If $D$ is consistent, then $D$ has exactly one consistent extension $\bar{D}$.

Proof. (1) There is at least one consistent extension: An extension $E$ of a default theory is inconsistent if and only if the background theory is inconsistent, and every normal default theory has at least one extension. As our default theory is normal and $D$ is consistent, we have at least one consistent extension.

(2) There is at most one consistent extension: Assume there exist two consistent extensions $\bar{D}$ and $\bar{D}'$. WLOG, there must then be one belief statement $\varphi \in \bar{D}'$ that is not in $\bar{D}$. Let $\varphi = \Box_w t^s$ with $w = v \cdot i$. As $D \subseteq \bar{D}$ and $D \subseteq \bar{D}'$, $\varphi$ must be implicit and there must be a grounded default rule

$d_1 = \frac{\Box_v t^s : \Box_{v \cdot i} t^s}{\Box_{v \cdot i} t^s}$

which is satisfied for $\bar{D}'$, but not for $\bar{D}$. This can happen either because (i) the prerequisite $\Box_v t^s \in \bar{D}'$, but $\notin \bar{D}$; or (ii) the justification $\Box_{v \cdot i} t^s$ is consistent with $\bar{D}'$, but not with $\bar{D}$. We only have to focus on case (ii), as case (i) can be reduced to at least one occurrence of case (ii) by backchaining. So it suffices to disprove (ii).

So assume that $\Box_v t^s \in \bar{D}$ and $\in \bar{D}'$, but $\Box_{v \cdot i} t^s$ is consistent with $\bar{D}'$ and inconsistent with $\bar{D}$. For that to happen, there must be a belief statement $\Box_{v \cdot i} t'^{s'} \in \bar{D}$ but $\notin \bar{D}'$, which is inconsistent with $\Box_{v \cdot i} t^s$ and, hence, the grounded tuples $t^s$ and $t'^{s'}$ do not fulfill the requirements $\Gamma_1$ and $\Gamma_2$ of Prop. 5 for $\bar{D}'_w$ to be consistent. This necessarily implicit belief statement can only be in $\bar{D}$ because of another grounded default rule

$d_2 = \frac{\Box_v t'^{s'} : \Box_{v \cdot i} t'^{s'}}{\Box_{v \cdot i} t'^{s'}}$

For the prerequisite of $d_2$ to be satisfied, $\Box_v t'^{s'}$ has to be in $\bar{D}$. Hence, both $\Box_v t^s$ and $\Box_v t'^{s'}$ would have to be in $\bar{D}$, and the belief world $\bar{D}_v$ would be inconsistent, which contradicts our assumptions. □

Some thoughts. (1) In default logic, the extension of a logical theory $W$ creates a new logical theory $E$ that "extends" $W$. In contrast, in standard database nomenclature, the extensional database refers to the explicitly stored tables and the intensional database to the relations defined or implied by rules. To avoid a possible naming ambiguity, we call explicit all belief statements $\varphi \in D$ and implicit all $\varphi \in \bar{D} \setminus D$. (2) In default logic, $D$ stands for the set of default rules, whereas we use $D$ for the explicit part of a belief database, corresponding to the standard usage in database literature. (3) Our default schema $d_s$ defines infinitely many default rules and an infinite extension. (4) Note that consistency is defined by extended key constraints and differs from the propositional case: $\varphi \cup D$ consistent $\Leftrightarrow \neg\varphi \notin D$.



C.3 Errata 

This report includes the following corrections over the 
final PVLDB version: 

• Sect. 5.1: Relation S: 

S(wid(w), wid(dss(w[2,d])))

instead of 

S(wid(w), wid(dss(w))) 

• Sect. 5.3: 3rd paragraph: 

"Given the world id y of w ..." 

instead of 

"Given the world id x of w ..." 

• Sect. 5.3: Algorithm 2: 



7 For all worlds for which w is the deepest 
suffix state for v-w, update the ru^j-edge: 

8 ASfas,dss(u> [2)d ])) 



instead of 



7 For all worlds v -wy l d _ ji for which w is the deepest 
suffix state, update the ui^-edge: 

8 AS(x, dss(tu)) 



• Sect. 5.3: Algorithm 3:



l for p = 1 . . . 1) do 

[T(z, y):-E* (0, w M , z),D(z, y) 



instead of 



1 for p = 2 . . . (d + 1) do 

\_T(z, y):-E* (0, w {p A , x),D(z, y) 